What does a set of screwdriver bits have in common with the AWS SDK for PHP?
Surely you have been in this situation before: You only need one particular screwdriver bit, but since you can’t buy it separately, you have to buy a full set. Well, what can you do, and who knows, maybe you’ll need a second one in the future?
The same goes for the AWS SDK for PHP. You may only need one or two services in your project right now, but you still have to install the full SDK, with an API for almost every AWS service that exists.
But it’s free, so what’s the problem? Just download the bunch and forget about the rest, right?
Nothing comes for free
First of all, it’s a lot of code. Over 40 MB of plain PHP code, spread across 4000 files. That might be acceptable for a monolith (well, no, not really), but what about a Lambda function? For me, it kills the idea of Lambda if a function has to be shipped with thousands of lines of code just to connect to DynamoDB, for example. I can’t even edit the code in the AWS console because it’s too big for the editor.
And it gets updated a lot due to the sheer number of APIs it contains, often twice a week or more. It’s great that it’s well maintained, but a lot of updates are for code you don’t use. It’s like buying a new bit set, just because one of the other bits has changed.
Second, and more importantly, there’s the carbon footprint of all this. How do we estimate the emissions caused by the data transfer?
Let’s do some number crunching. The AWS SDK for PHP has about 200k daily installs, according to Packagist.org. Even gzipped, each install still transfers about 3.7 MB of data. Let’s see what that means.
Carbon emissions of data transfer
So what’s the environmental impact of that? Let’s assume that, in most cases, 3 MB of the 3.7 MB in each download is compressed code that the project never uses. That gives us 200k × 3 MB × 365 = 219 TB of data transfer per year, or about 600 GB per day, that could have been avoided.
If we follow the Sustainable Web Design methodology and assume that 1 GB of network traffic emits about 350 g of carbon (0.81 kWh/GB × 442 g CO2e/kWh), those 600 GB cause over 210 kg of carbon emissions per day. That doesn’t sound like much, but it adds up to over 75 tonnes of carbon per year, all for code that we don’t need!
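To make the estimate easy to check or to rerun with different inputs, here is a minimal sketch of the arithmetic in Python. The input figures are the ones quoted in this post. The function and variable names are made up for illustration, and the script assumes decimal units (1 GB = 1000 MB). It uses the unrounded factor of 0.81 × 442 ≈ 358 g per GB instead of the rounded 350 g, so it lands slightly higher: about 215 kg per day and about 78 tonnes per year.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Assumption: decimal units (1 GB = 1000 MB, 1 TB = 1000 GB).

DAILY_INSTALLS = 200_000   # approximate daily installs, per Packagist.org
UNUSED_MB = 3.0            # assumed unused share of each 3.7 MB download
KWH_PER_GB = 0.81          # energy per GB transferred (Sustainable Web Design)
G_CO2E_PER_KWH = 442       # grid carbon intensity in g CO2e per kWh


def avoidable_gb_per_day(installs: int, unused_mb: float) -> float:
    """Data transferred per day for code that is never used, in GB."""
    return installs * unused_mb / 1000


def kg_co2e(gigabytes: float) -> float:
    """Estimated emissions in kg CO2e for a given amount of transfer."""
    return gigabytes * KWH_PER_GB * G_CO2E_PER_KWH / 1000


gb_per_day = avoidable_gb_per_day(DAILY_INSTALLS, UNUSED_MB)
tb_per_year = gb_per_day * 365 / 1000
kg_per_day = kg_co2e(gb_per_day)
tonnes_per_year = kg_per_day * 365 / 1000

print(f"Avoidable transfer: {gb_per_day:.0f} GB/day, {tb_per_year:.0f} TB/year")
print(f"Estimated emissions: {kg_per_day:.0f} kg/day, {tonnes_per_year:.1f} t/year")
```

Changing `UNUSED_MB` or the two intensity constants shows how sensitive the result is to the assumptions behind it.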
Of course, we can debate whether the methodology used here makes sense. It’s based on estimates for the global network, but we have to start somewhere. Even if the numbers are not correct, I think they serve well to illustrate the issue. Please don’t hesitate to provide feedback if you think the approach I chose here is not correct.
What would be a solution?
Ideally, AWS would break the SDK into smaller pieces so that you could include only the dependencies for the services you really need. If you prefer not to wait for miracles, you can use the great Async AWS project [3], which already does this.
For me, this is a perfect example of what’s wrong with software development. We have become lazy, and we can simply compensate for this laziness with more resources. This thinking leads to “overweight” software and even more resource consumption.
On the other hand, it shows that there is great potential to reduce the environmental impact of IT.
What do you think? Is this a problem worth thinking about? And do you generally care about the size of the dependencies that are used in your projects?