Monday, August 27, 2012
If you've doubted the whole 'off-site storage' concept because of pricing, this will perhaps make you change your tune: At a mere 0.01€ per GB-month, you can store anything you want. The catch? The service is designed to make it really cheap to store data, but at the price of making it more costly to get your data back.
Here's why: uploading is free (except the cost of your own bandwidth) and immediate: there's no limit but the speed of the Internet connection. Once your data is uploaded, however, you won't see it again until you request it by way of a 'job'. Any sort of operation that you want to perform on your 'vault' (even listing its contents) is requested by such a job. Once queued, this job takes about 4 hours to complete. That's right. 4 hours. Why? I'll tell you in a minute.
This contrasts with S3, AWS's other low-cost storage solution, which is only about ten times as expensive. In Amazon S3, your uploaded files become available immediately. This is why providers like Dropbox have built their applications on top of S3. Combine S3 with CloudFront and you have a very potent web publishing platform.
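To see what that factor of ten means in practice, here's a quick back-of-the-envelope comparison. The 500 GB size and 12-month duration are my own example numbers, and retrieval and request fees are ignored:

```python
# Storing 500 GB for a year: Glacier at 0.01 per GB-month versus
# S3 at roughly ten times that (the 2012 ballpark mentioned above).
GLACIER_PER_GB_MONTH = 0.01
S3_PER_GB_MONTH = 0.10
SIZE_GB = 500
MONTHS = 12

glacier_cost = SIZE_GB * GLACIER_PER_GB_MONTH * MONTHS  # 60.0
s3_cost = SIZE_GB * S3_PER_GB_MONTH * MONTHS            # 600.0
print(f"Glacier: {glacier_cost:.2f}, S3: {s3_cost:.2f}")
```

Half a terabyte for a year for the price of a dinner, as long as you never ask for it back.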
But to make this work, Amazon has to build a gigantic storage area network: it has to buy the hard drives, but it also has to put them in servers and keep them powered and cooled. Which got me pondering about Glacier. How do they make it work?
Here's how I think it works. Disclaimer: what follows is my opinion, and it's all conjecture. Amazon has a reputation for not going into much detail about how its services work behind the scenes, except when it's good marketing to do so (as with DynamoDB and its use of SSDs) or good public relations (after an incident, for example).
What is cheaper than keeping hard drives in a storage area network? It kept me up for a while, but then it hit me: they're not keeping those hard drives online! They're probably using a system in which hard drives are detached from the network until needed, and stored in a safe location.
Picture this: every time you upload something, it goes onto a hard drive that is currently connected. Everybody is uploading all the time, so it's simply a matter of keeping enough hard drives attached, filling them up as required, to handle the upload demand. This is basically a streaming process: Amazon only needs to ID every drive (probably using QR codes or something) and keep track of which 'archive' (the single unit of upload) is stored on which drive.
When a disk is full, it gets detached and goes into storage. To increase durability, they probably do this for the same archive a couple of times, in different locations (but always in the same region, to comply with their service agreement). This can be done either manually or using some sort of robot. It doesn't necessarily have to be hard drives, either. Magnetic tape has evolved too, but given today's hard drive prices, it may be far-fetched.
Given that the disk is now disconnected and sitting in a cabinet, it's pretty hard to get your data back. Which is where the jobs come in. When you put in a request to retrieve some of the data you uploaded, your request goes into a queue. They may start looking for your drive the minute the request is received, or they may have to wait to cope with demand, but none of that matters: it's queued, so they can hire more workers or extra robots whenever they can't handle the queue in time anymore. Your disk comes out of the closet and onto a queue before it gets connected back to the system (a real treadmill might be involved too!). Once it's connected, the archives that need to be fetched are copied into online storage, you're notified of the completion of the job, and you can download your stuff. Hard drives that haven't been accessed for a predetermined period may even be put in the queue at random, just to check their integrity!
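To make the bookkeeping concrete, here's a toy simulation of the design I'm imagining. Every name and number in it (the `ToyGlacier` class, the 4 GB drive capacity, the drive and archive IDs) is my own invention for illustration, not anything Amazon has published:

```python
# Toy model of the guessed design: archives land on whichever drive is
# currently attached; full drives go offline into the "cabinet"; a
# retrieval job waits until a worker re-attaches the right drive.
from collections import deque

DRIVE_CAPACITY_GB = 4  # tiny on purpose, so the example fills drives fast

class ToyGlacier:
    def __init__(self):
        self.online = {"drive-0": {}}   # drive id -> {archive id: size}
        self.offline = {}               # the cabinet
        self.index = {}                 # archive id -> drive id
        self.jobs = deque()             # retrieval requests, FIFO
        self._next_drive = 1
        self._next_archive = 0

    def upload(self, size_gb):
        # a real system would pick the emptiest attached drive;
        # the toy just takes the first one
        drive = next(iter(self.online))
        if sum(self.online[drive].values()) + size_gb > DRIVE_CAPACITY_GB:
            # drive is full: detach it and attach a fresh one
            self.offline[drive] = self.online.pop(drive)
            drive = f"drive-{self._next_drive}"
            self._next_drive += 1
            self.online[drive] = {}
        archive = f"archive-{self._next_archive}"
        self._next_archive += 1
        self.online[drive][archive] = size_gb
        self.index[archive] = drive
        return archive

    def request(self, archive):
        self.jobs.append(archive)       # queued; nothing happens yet

    def run_worker(self):
        # hours later: a worker walks to the cabinet, re-attaches
        # drives, and stages the requested archives in online storage
        staged = []
        while self.jobs:
            archive = self.jobs.popleft()
            drive = self.index[archive]
            if drive in self.offline:
                self.online[drive] = self.offline.pop(drive)
            staged.append(archive)
        return staged

g = ToyGlacier()
ids = [g.upload(3) for _ in range(3)]   # fills and rotates drives
g.request(ids[0])                        # job queued, drive still offline
print(g.run_worker())                    # prints ['archive-0']
```

The only per-archive state Amazon would need to keep online is that tiny `index`; the bulk data sits on powered-down disks until a job pulls one back in.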
And that explains the 3.5 to 4 hour wait for retrieval requests to complete, and why the cost is biased towards retrieval (the pricing of which is, I must say, rather confusing).
Isn't it all just brilliant?
Next: how you can become your own Amazon Glacier using old hard drives you have lying around and a cabinet that you can lock.
Monday, July 30, 2012
So I've been thinking for a while: why don't we simply put all of our paperwork in the cloud? The accountant, the bank manager, the architect, the construction company: all the documents we've exchanged with them in the last year, we've been able to handle by e-mail, because at work we have this very large but very fast scanner that puts everything you throw at it on the network. Why not do something like this at home?
With our recent move into the new house, I replaced my aging Dell colour laser printer and Canon photo printer with a Canon MX-895 multifunctional. Not only does it save space (that laser printer was rather bulky), but it also comes with a document feeder and does two-sided scanning of a whole stack of papers. From there it scans over the network to my home computer, where it's a simple matter of dragging and dropping the documents into a Chrome browser window on the proper Google Drive folder (I'm a Google Apps for Business customer).
Google does the OCR and makes your scans searchable and everything!
So I started scanning all the incoming paperwork. Not much to begin with, and I don't even know if I'll be scanning in older stuff; that would take ages. But two days into this, I needed some obscure detail from the move to communicate with someone, and found it waiting for me in my Google Drive, while at the office!
(...and people ask me why I'm all for the Cloud!)
Monday, May 14, 2012
Allow me to rant about this particular blogpost.
Well, there's probably more than one way to divide developers up. There are those who want to read code on paper, and those who want to browse around it in their favorite development environment. Yes, those who want to give a particular revision of the source code a lasting impression on paper for all eternity still exist. I hope they're close to retiring, or to changing their minds.
First of all, the claim that ReSharper "practically mandates" the use of var is completely false. You can set it either way, so if your team decides that 'var' is the next sign of a disintegrating civilization, it can simply flip the setting so that 'var' is always encouraged to be replaced by its proper type at the time of coding. On to the first point:
- Implicitly typed variables lose descriptiveness
- 'var' encourages Hungarian notation
- Specificity vs Context
- Increased reliance on IntelliSense
- No backward compatibility
- Less reliance on types
Friday, January 06, 2012
- We'd like you to use our web app or social network instead and will annoy you until you do.
- Our app-review rules are always in someone's best interest.
- Anyone who wants a [popular new product category that Apple doesn’t make yet] should just curl up and die.
- Android is using other people's work.
- Don't look evil.
- We solicit all of your personal information and track everything you do because that's how we make money.
- Our brands want to push their products onto our users.
- We value your privacy as much as we think you do.
- We're not tracking you when you log out. We simply never log you out.
Thursday, March 03, 2011
Today I found a better way to do it, thanks to this post on Stack Overflow. You'll have to modify the steps the poster follows, though, mostly because you can't log in as root on an Amazon Linux instance; everything is available through sudo, however.
First you have to set up an instance (use a Large instance, you won't regret it), and the easiest way to do that is through the Amazon Management Console. Once you have your instance running and you can log in, do the following:
First, add the official Mono repository to yum. I'm quite new to yum, but the SO post made it quite clear. In your home directory, issue the command 'vi mono.repo' and press 'i' to enter insert mode, then paste the following snippet:
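(The original snippet didn't survive here. From memory it was the Mono project's repo definition of the time, shaped roughly like the file below; treat the baseurl as a placeholder and take the real one from the Stack Overflow post or the Mono download page:)

```
[mono-addon]
name=Mono addon repository
baseurl=http://ftp.novell.com/pub/mono/download-stable/RHEL_5/
enabled=1
gpgcheck=0
```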
Then press the 'escape' key and type ':wq' to save and quit. I'm not used to vi, but I know it's on there, and this happens to work. Now you have the repo file in your home directory; it needs to go into the /etc/yum.repos.d directory, but only root can put it there, so issue the command 'sudo mv mono.repo /etc/yum.repos.d'.
Next, you'll need to clear the cache using 'sudo yum clean all' and install the Mono stack using 'sudo yum install monotools-addon-server'. That will install the latest stable version of Mono, currently 2.10.1.
You can do all this on a Large instance to make it quicker, then shut down your instance, make a snapshot, terminate the instance, and launch a new Micro instance from that snapshot, and you'll be set up. For some funny reason Micro instances are 64-bit, and Large instances are too, but regular Small instances are 32-bit only.
Now I have to figure out how to set up a decent web server so I can try to get ASP.NET MVC going.
UPDATE: I definitely needed one more thing to make ASP.NET MVC work: installing WCF, by issuing the command 'sudo yum install mono-addon-wcf'. The mod_mono module is auto-hosting, which means that once installed, it loads automatically, so ASP.NET should work out of the box; the example site is located at /opt/novell/mono/lib/xsp/test.
Saturday, November 27, 2010
It was Proximus, my mobile carrier. The nice voice on the other end of the line started like this: "Hello sir, we've done an analysis of your mobile usage. This won't take long. Can I ask you, how much do you pay, on average, every month?".
I thought to myself: if they've done an analysis, they probably know more about my usage than I do, since I rarely check my phone bill. I'll spare you the details, but everything that followed was a plot to trick me into signing up for a different plan, one that would stick for 18 months and have me paying a lot more than I pay today. Her words: "that's not so bad, is it?"
Hmmm..... no thanks.
Friday, November 19, 2010
- Read a bunch of .NET related books. From each book, try to remember the stuff you feel will make you a better developer, and forget the rest.
- Read a bunch of programming books. From each book, try to remember the stuff you feel will make you a better developer, and forget the rest.
- Learn Unit Testing, so that when the day comes that your senior developer or manager coerces you into using it, at least you can defend yourself if you feel it's not what you need.
- The same goes for Continuous Integration, Cloud Computing, ORMs, Scrum, BDD, DDD, TDD, EDD, ServiceBus, Messaging Architectures, CQRS, and IoC Containers.
- Keep yourself up to date by reading popular blogs and following popular people on Twitter. Many of these are biased or outright zealots, but that’s ok because you should follow all sides at once. While you do, keep an eye out for “the next big thing” and keep a ton of salt ready.
- Realize that, if you’re not a passionate developer, there’s not much use in reading these tips and you probably won’t ever become a better developer. If that makes you unhappy, find a different career.
- Learn about version control systems. Regardless of how many you know already, there may be more that you’ll never ever encounter, but it’s nice to see just how green the grass is on the other side. It may be greener, but it’s probably browner once you’re past the marketing fluff.
- Realize that, if you’re a passionate developer, you probably know most of these “tips” already, or at least you’re convinced you did and you’re just scanning over them to see if there’s any new acronyms in there that you didn’t hear about yet. Good boy.
- Listen to whatever Anders Hejlsberg says. He’s the next best thing to God himself. If he’s in town, clear your schedule.
- Realize that every list of tips you read is written by a passionate developer, who has his own personal beliefs that may cloud his judgement from time to time when he prepares such a list of tips. Such as yours truly.