Monday, August 27, 2012

How Amazon Glacier works (or at least, in my mind)

A week ago, Amazon's Web Services division bestowed upon us another game-changing product, one that made me so excited I couldn't sleep: Amazon Glacier. If you don't know what it is, I can sum it up for you in one line: Amazon Glacier is a new web service that lets you store your data safely, securely and remotely in cold storage at a rock-bottom price.

If you've doubted the whole 'off-site storage' concept because of pricing, this will perhaps make you change your tune: At a mere 0.01€ per GB-month, you can store anything you want. The catch? The service is designed to make it really cheap to store data, but at the price of making it more costly to get your data back.

Here's why: uploading is free (except the cost of your own bandwidth) and immediate: there's no limit but the speed of the Internet connection. Once your data is uploaded, however, you won't see it again until you request it by way of a 'job'. Any sort of operation that you want to perform on your 'vault' (even listing its contents) is requested by such a job. Once queued, this job takes about 4 hours to complete. That's right. 4 hours. Why? I'll tell you in a minute.

This contrasts with S3, AWS's other low-cost storage solution, which is only 10 times as expensive. In Amazon S3, your uploaded files become available immediately. This is why providers like Dropbox have built their applications on top of S3. Combine S3 with CloudFront and you have a very potent web publishing platform.

But to make this work, Amazon has to build a gigantic storage area network: it has to buy the hard drives, but it also has to put them in servers and keep them powered and cooled. Which got me pondering Glacier. How do they make it work?

Here's how I think it works. Disclaimer: what follows is my opinion, and it's all conjecture. Amazon has a reputation for not going into much detail about how its services work behind the scenes, except when it's good marketing to do so (as in the case of DynamoDB and its use of SSDs) or good public relations (in the case of an incident, for example).

What is cheaper than keeping hard drives in a storage area network? It kept me up for a while, but then it hit me: they're not keeping those hard drives online! They're probably using a system in which hard drives are detached from the network until needed, and stored in a safe location.

Picture this: every time you upload something, it goes onto a hard drive that is connected. Now, everybody is uploading all the time, so it's simply a matter of keeping enough hard drives attached and filling up as required to handle the upload demand. This is basically a streaming process: Amazon only needs to ID every drive (probably using QR codes or something) and keep track of which 'archive' (the single unit of upload) is stored on which drive.

When a disk is full, it gets detached and goes into storage. To increase durability, they probably do this for the same archive a couple of times in different locations (but always within the same region, to comply with their service agreement). This can be done either manually or using some sort of robot. It doesn't necessarily have to be hard drives either: magnetic tape has evolved, but given today's hard drive prices, it may be far-fetched.

Given that the disk is now disconnected and put in a cabinet, it's pretty hard to get your data back. Which is where the jobs come in. When you put in a request to retrieve some of the data you uploaded, your request goes into a queue. They may start looking for your drive the minute the request is received, or they may have to queue it to cope with demand, but none of that matters: it's queued, so they can hire more workers or extra robots when they can't handle the queue in time anymore. Your disk comes out of the closet and onto a queue before it gets connected back to the system (a real treadmill might be involved too!). Once connected, the archives that need to be fetched are placed into online storage, you're notified of the completion of the job, and you can download your stuff. Hard drives that haven't been accessed for a predetermined period of time may be randomly put into the queue just to check them for integrity!

And that explains the 3.5 to 4 hour wait for retrieval requests to complete, and the bias of the cost towards retrieval (the pricing of which is, I must say, rather confusing).

Isn't it all just brilliant?

Next: how you can become your own Amazon Glacier using old hard drives you have laying around and a cabinet that you can lock.

Wednesday, August 22, 2012

ASP.NET Web API: Customizing the JSON representation for every request.

Returning JSON in response to a web request is what we all love to do, right? Except of course there's a serious security problem with returning arrays in JSON. So serious, in fact, that the ASP.NET MVC team decided to add a 'feature' that requires you to explicitly state that you intend to return JSON for a GET request.

Phil Haack called it JSON hijacking in his blog post. The obvious workaround is to wrap your array in an object. Cool!

Now I want to do this in ASP.NET Web API. Except of course, I don't want to change my model for this (there's nothing wrong with returning XML; well, nothing related to this security threat anyway), so what I was looking for was a way to customize how the JSON was created.

The solution is to create a class that inherits from JsonMediaTypeFormatter, which is the class responsible for supporting the JSON media type. I'm calling it MyJsonMediaTypeFormatter because it's... mine.
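
A minimal sketch of such a formatter; the 'data' wrapper property name is my own illustrative choice:

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Formatting;
using System.Threading.Tasks;

public class MyJsonMediaTypeFormatter : JsonMediaTypeFormatter
{
    public override Task WriteToStreamAsync(Type type, object value,
        Stream writeStream, HttpContent content, TransportContext transportContext)
    {
        // Wrap the value in an anonymous object so the top-level JSON
        // token is always an object, never a hijackable bare array.
        return base.WriteToStreamAsync(typeof(object), new { data = value },
            writeStream, content, transportContext);
    }
}
```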

As you can see, I'm only overriding one method, and I'm not doing too much either: simply wrapping the value in an anonymous object and letting the base class do all the heavy lifting.
Now onto configuration. I simply add the following lines to my Application startup:
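
Something along these lines, assuming the default GlobalConfiguration setup of Web API:

```csharp
// In Global.asax.cs, inside Application_Start:
var config = GlobalConfiguration.Configuration;

// Swap the stock JSON formatter for the wrapping one.
config.Formatters.Remove(config.Formatters.JsonFormatter);
config.Formatters.Add(new MyJsonMediaTypeFormatter());
```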

And presto, every JSON response that is handled through ASP.NET Web API is wrapped in an object!

Monday, July 30, 2012

All my documents in the cloud.

Paperwork is such a hassle. I can organize my home network, file shares, digital media and business, but my office has always looked as if a bomb had exploded, and I've got plenty of stories of documents that were in there somewhere but were nowhere to be found when needed. The last couple of years taught me something, though: whenever I had received an invoice electronically and the accountant e-mailed me that some payment (usually for an online purchase) was missing its invoice, I would find it in a jiffy and send it to him electronically.

So I've been thinking for a while: why don't we simply put all of our paperwork in the cloud? The accountant, the bank manager, the architect, the construction company: all the documents that we've exchanged in the last year, we've been able to exchange by e-mail, because at work we have this very large but very fast scanner that puts everything you throw at it on the network. Why not do something like this at home?

With our recent move into the new house, I replaced my aging Dell colour laser printer and Canon photo printer with a Canon MX-895 multifunctional. Not only does it save space (that laser printer was rather bulky), but it also comes with a document feeder and does two-sided scanning of a bunch of papers that you put on it. From there it scans over the network to my home computer, where it's a simple matter of dragging and dropping those documents into a Chrome browser window onto the proper Google Drive folder (I'm a Google Apps for Business customer).

Google does the OCR and makes your scans searchable and everything!

So I started to scan in all the incoming paperwork. Not much to begin with, and I don't even know if I'll be scanning in older stuff (that would take ages), but two days into this I already needed some obscure detail from the move to communicate with someone, and I found it waiting for me in my Google Drive, while at the office!

(...and people ask me why I'm all for the Cloud!)

Monday, May 14, 2012

He's got it all wrong

I was masterfully lured into reading someone's blog post today about the proper use of the 'var' keyword. This is a topic that stays hot, both in the community and in the workplace. The reason it stays hot is that there are basically two camps of developers: those who want to read what a bit of code wants to do, and those who want to read how it's done.

Allow me to rant about this particular blogpost.

Well, there's probably more than one way to divide developers up. There are those who want to read code on paper and those who want to browse around it in their favorite development environment. Yes, those who want to give a particular revision of the source code a lasting impression on paper for all eternity still exist. I hope they're close to retiring or changing their minds.

First of all, the claim that ReSharper "practically mandates" the use of var is completely false. You can set it either way, so if your team decides that 'var' is the next sign of a disintegrating civilization, you can simply set it the other way, so that every 'var' is flagged for replacement by its explicit type at the time of coding. On to the first point:

Implicitly typed variables lose descriptiveness

That's a bold claim. Just because the type name provides "an extra layer of description" doesn't mean that it's useful. That type name just states the obvious. The real description is in the naming of the variables, and the methods that are used. Example:

// how is it clear to the reader that I can do this?
return individuals.Compute("MAX(Age)", String.Empty);

The answer? It's not relevant. The compiler decides what can and can't be done. Intellisense will inform the user while typing up that code. What is the reader trying to verify?

'var' encourages Hungarian notation

I think this is absurd. If a developer really wants to show the type of a variable he shouldn't be using 'var' at all. 

Specificity vs Context

Same point different paragraph. 

// you can't blame the programmer for making this mistake

Yes, yes you can. You can blame him for submitting a piece of code that doesn't compile.

Increased reliance on IntelliSense

This might be a valid point. But then again, if I don't have IntelliSense, I wouldn't be as productive as I am today. Heck, whenever I have to type up a piece of code in Notepad I cringe. I'm sure that using 'var' doesn't help, but I've got lots more to worry about when away from my IDE.

No backward compatibility

Well, for starters, C# 3.0 and .NET 2.0 are unrelated. One is a compiler version, the other is a framework version. You can write all your .NET 2.0-compatible code in C# 3.0 and use all the language features of that compiler; it will run just fine. But of course, if you want your code to compile on the older C# 2.0 compiler, then you're screwed. But I guess in that case not using 'var' is not going to help you: no lambdas, no extension methods, no anonymous types, no initializers, and so forth. If you're going to target the C# 2.0 compiler, better plan for it from day 1 and think back to the good ol' year 2005, when LINQ was still a wet dream.

So what does the use of 'var' buy us?

Less noise

No more "Dictionary<string, string> dictionary = new Dictionary<string, string>()". 
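
A minimal before-and-after of that very example (the variable names are mine):

```csharp
using System.Collections.Generic;

class VarDemo
{
    static void Main()
    {
        // Without 'var': the type name is spelled out twice on one line.
        Dictionary<string, string> explicitlyTyped = new Dictionary<string, string>();

        // With 'var': the compiler infers the exact same static type,
        // and the right-hand side still tells the reader what it is.
        var inferred = new Dictionary<string, string>();

        inferred["greeting"] = "hello";
        System.Console.WriteLine(inferred["greeting"]); // prints "hello"
    }
}
```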

Less reliance on types

Without 'var', every time you decide that you need to abstract some portion of your code, add another layer of indirection or do some refactoring, chances are you'll be replacing a lot of types all over your code. If you're lucky, your favorite VS add-in will guide you through it, but I've had more than my share of compiler errors because somehow, somewhere, I still had a variable using the 'old' type instead of the new-and-improved type.

This is where the dynamic language boys are laughing in our faces. They claim that their language doesn't impose this sort of "red tape" and then move on to why unit testing is so important, while in a statically typed language there's a compiler that catches half of the mistakes we make before we even run our code.

Closing remarks

That's it, I've burned through the rage that made me write up this rant. In short, I'd like to encourage you to embrace your compiler and use all the features it provides, so you can focus on writing the most concise and elegant code that you can without sacrificing code quality. To me, that means removing all the cruft wherever possible and relying on the tools to provide me with the context needed to wade through my code.

Friday, January 06, 2012

Marco has a point. When you read his blogpost, this is how you should actually read it:
  • We'd like you to use our web app or social network instead and will annoy you until you do.
  • Our app-review rules are always in someone's best interest.
  • Anyone who wants a [popular new product category that Apple doesn’t make yet] should just curl up and die.
  • Android is using other people's work.
  • Don't look evil.
  • We solicit all of your personal information and track everything you do because that's how we make money.
  • Our brands want to push their products onto our users.
  • We value your privacy as much as we think you do.
  • We're not tracking you when you log out. We simply never log you out.

Thursday, September 22, 2011

Building a house!

The contract has been signed.
That's it for now. Later I'll tell you all about the house I'm building. As a teaser: I'm programming the PLC to do the home automation myself. It isn't much, but it's geeky enough to talk about.

Wednesday, April 27, 2011

Techdays, and a movie afterwards

Today was the second day of the TechDays Belgium 2011 conference. It was a blast, and I learned a bunch of new stuff that will keep me busy over the next couple of months as I begin to digest the information overload that I experienced over the last couple of days.
All the editions that I've participated in so far have been organized at the Metropolis in Antwerp. It's a movie theatre complex from the Kinepolis Group that I particularly like for its wide range of theatres, its predictable service and choice of catering, and the quality of its projections. Using it as a venue for a development conference pays off, as you get to use the high-quality projectors, screens and audio to do what a conference does best: show stuff to people. The catering is not bad either, and all in all it's a wonderful experience that's very well handled by their staff. I'm sure tons of stuff went wrong behind the scenes that we "devs" will never even realize happened, because of the professionalism with which such a big event is organized.
Which leaves me with a sad note. We were all given some nice swag to go along with the conference, like a bag with goodies (mostly marketing junk, but the bag is nice) to put in the stuff that you trade your soul for (as in: have your badge scanned and thereby agree to be spammed for the remainder of your days). I also like to take the opportunity each year of finishing this conference with a visit to a movie. Except this time, I was met with the "other" side of the theatre.
Even though there were like a thousand guys like me walking around the complex with this issued bag for two days, I was promptly denied access to the theatre because this "bag" was deemed a security risk. Never mind that this particular complex has a vast underground parking area from which one could easily plant some sort of explosive device that could do some serious damage. No, sir, your bag is a security risk, and we can't allow you to enter unless you're willing to part with your proudly received swag for a couple of hours in one of our fine lockers.
It's not so much that I was discomforted by not being able to carry my bag inside (which contained the Oatmeal book that I had planned to read during the commercial break), but the fact that I felt seriously violated. Here I paid 9 euros to enter a theatre to watch a movie ("The Adjustment Bureau", and it's a nice one), and even before I entered the room I was already being treated like a criminal. And yet they keep wondering why they're losing customers. Apparently being subjected to anti-piracy campaigns before the start of the movie wasn't enough. What's next? Strip searches? Full-body scanners? Uniformed company drones with blue gloves?
I am truly disgusted.

Thursday, March 03, 2011

Setting up Mono on Amazon EC2

A while ago I managed to set up Mono on an Amazon EC2 instance running the standard Amazon Linux AMI by compiling from sources. Boy, that was a pain, it took a while (I was stupid enough to do it on a Micro instance), and it turned out to be totally unnecessary.

Today I found a better way to do it, thanks to this post on StackOverflow. You'll have to modify the steps that the poster follows, though, mostly because you can't log in as root on an Amazon Linux instance, but everything is available using sudo.

First you have to set up an instance (use a Large instance, you won't regret it), and the easiest way to do that is through the Amazon Management Console. Once you have your instance running and you can log in, do the following:

First, add the official Mono repository to yum. I'm quite new to yum, but the SO post made it quite clear. In your home directory, issue the command 'vi mono.repo', and press 'i' to enter 'insert mode', then paste the following snippet:

name=Mono Stack (RHEL_5)
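
For reference, a yum repo file generally has this overall shape; the section name and baseurl below are placeholders, the real values come from the Mono project's download pages:

```ini
[mono-stable]
name=Mono Stack (RHEL_5)
type=rpm-md
baseurl=http://example.com/mono/download-stable/RHEL_5/
enabled=1
gpgcheck=0
```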

Then, press the 'escape' key, and type ':w' and ':q'. I'm not used to vi, but I know it's on there and this happens to work. Now you have the repo file in your home directory and you need to move it to the /etc/yum.repos.d directory, but only root can do that, so issue the command 'sudo mv mono.repo /etc/yum.repos.d'.

Next, you'll need to clear the cache using 'sudo yum clean all' and install the Mono stack using 'sudo yum install monotools-addon-server'. That will install the latest stable version of Mono, currently 2.10.1.
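
Put together, the steps so far boil down to this (run on the instance itself; sudo rights and network access assumed):

```shell
# Move the repo definition into yum's config directory (root only)
sudo mv mono.repo /etc/yum.repos.d/

# Refresh yum's metadata and install the Mono stack
sudo yum clean all
sudo yum install -y monotools-addon-server
```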

You can do this on a Large instance to make it quicker, then shut down your instance, make a snapshot, terminate the instance and launch a new Micro instance from that snapshot, and you'll be all set up. For some funny reason Micro instances are 64-bit, and Large instances are too, but Normal instances are 32-bit only.

Now I have to figure out how to set up a decent web server so I can try to get ASP.NET MVC going.

UPDATE: I definitely needed the following to make ASP.NET MVC work: I had to install WCF by issuing this command: 'sudo yum install mono-addon-wcf'. The mod_mono module is auto-hosting, which means that when installed, it loads automatically, so ASP.NET should work out of the box; the example site is located at /opt/novell/mono/lib/xsp/test.

Saturday, November 27, 2010

Dear Proximus, tell your telemarketers to stop lying to my face.

I had a call the other day. It was a blocked number. I don't usually answer blocked numbers. They're usually telemarketers. I hate telemarketers. I don't want to answer the phone when it's a telemarketer. But you're never really sure, right? So they called again the day after. I answered, figuring that if I didn't, this would go on and on.

It was Proximus, my mobile carrier. The nice voice on the other end started like this: "Hello sir, we've done an analysis of your mobile usage. This won't take long. Can I ask you, how much do you pay, on average, every month?"

I thought to myself: if they've done an analysis, they probably know more about my usage than I do, since I rarely check my phone bill. I'll spare you the details, but everything else was a plot to trick me into signing up for a different plan that would lock me in for 18 months and have me paying a lot more than I pay today. Her words: "that's not so bad, is it?"

Hmmm..... no thanks.

Friday, November 19, 2010

10 steps to become a better .NET developer

In response to this blog article, I'd like to present to you my very own list of tips to become a better .NET developer.

  • Read a bunch of .NET related books. From each book, try to remember the stuff you feel will make you a better developer, and forget the rest.
  • Read a bunch of programming books. From each book, try to remember the stuff you feel will make you a better developer, and forget the rest.
  • Learn Unit Testing, so that when the day comes that your senior developer or manager coerces you into using it, you can at least defend yourself if you feel it’s not what you need.
  • Ditto for Continuous Integration, Cloud Computing, ORMs, Scrum, BDD, DDD, TDD, EDD, ServiceBus, Messaging Architectures, CQRS and IoC Containers.
  • Keep yourself up to date by reading popular blogs and following popular people on Twitter. Many of these are biased or outright zealots, but that’s ok because you should follow all sides at once. While you do, keep an eye out for “the next big thing” and keep a ton of salt ready.
  • Realize that, if you’re not a passionate developer, there’s not much use in reading these tips and you probably won’t ever become a better developer. If that makes you unhappy, find a different career.
  • Learn about version control systems. Regardless of how many you know already, there may be more that you’ll never ever encounter, but it’s nice to see just how green the grass is on the other side. It may be greener, but it’s probably browner once you’re past the marketing fluff.
  • Realize that, if you’re a passionate developer, you probably know most of these “tips” already, or at least you’re convinced you did and you’re just scanning over them to see if there’s any new acronyms in there that you didn’t hear about yet. Good boy.
  • Listen to whatever Anders Hejlsberg says. He’s the next best thing to God himself. If he’s in town, clear your schedule.
  • Realize that every list of tips you read is written by a passionate developer, who has his own personal beliefs that may cloud his judgement from time to time when he prepares such a list of tips. Such as yours truly.