Allow instances with high disk bandwidth
Data processing workloads are usually bound by disk IO.
Currently Google provides expensive internal SSD disks that are bounded in size (3TB total per instance) or external disks that are capped at 240MB/s per instance.
AWS, on the other hand, offers internal HDD disks of up to 48TB with 6GB/s throughput (d2.8xlarge), as well as cheap persistent HDD disks that allow 1250MB/s of bandwidth per instance. Either way, EC2 has a much bigger offering for data processing workloads.
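A quick back-of-the-envelope sketch of what these per-instance limits mean in practice, assuming a full sequential scan of a 10TB dataset (the size mentioned later in this thread); the figures are the ones cited above:

```python
# Back-of-the-envelope: time for one full sequential scan of a 10TB
# dataset at each per-instance bandwidth cited above.
DATASET_MB = 10 * 1024 * 1024  # 10 TB in MB (binary units)

limits = {
    "GCP persistent disk, 240 MB/s": 240,
    "AWS st1/sc1 EBS, 1250 MB/s": 1250,
    "AWS d2.8xlarge local HDD, ~6 GB/s": 6 * 1024,
}

for name, mb_per_s in limits.items():
    hours = DATASET_MB / mb_per_s / 3600
    print(f"{name}: {hours:.1f} hours per scan")
# GCP PD:      ~12.1 hours
# st1/sc1:      ~2.3 hours
# d2.8xlarge:   ~0.5 hours
```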
Thanks for your feedback; we'll consider HDD-based instance storage in future product planning.
Please refer to the comments on this entry for more details about how our Local SSD product works and what it costs. I believe you’ll find it superior to the similar offering from AWS (i2), but we don’t currently have an exact equivalent that is HDD-based (d2).
Thanks Roman, this additional feedback is very helpful to us as we consider how to move forward with our product offerings. I'm sharing it with our colleagues on the storage team as well, and we'll definitely take your thoughts into consideration in our planning. As far as limits go, we have increased them in the past, so it's possible we would do so again in the future. Something like a d2 would be a bit different, as it involves new hardware, but it's also something we'll keep in mind.
Roman Gershman commented
By the way, if I had to prioritize, I think it makes more sense to provide something similar to the d2 offering than to provide st1/sc1-style disks. We do not need external disks and 3x replication for batch jobs, and of course that replication inflicts a huge performance penalty.
Roman Gershman commented
You are right that with internal SSDs you are competitive. However, SSDs are not required for "classic" data processing workloads (like MapReduce jobs), since they usually stream data sequentially and do not require random IOPS (see the sketch after this comment).
Our data processing infrastructure leverages the fact that it is possible to store and read huge chunks of data (>10TB) very fast from a single node. Google Cloud does not have such a capability at all (see another limit below). That practically blocks us from moving our data processing infrastructure from AWS to Google Cloud.
Please also consider raising the per-instance bandwidth limit for persistent disks. See http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html, which allows 1250 MB/s per instance for st1/sc1 drives.
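To illustrate the access pattern being described, here is a minimal sequential-read sketch (an illustration, not from the thread; the file path is a hypothetical placeholder, and a real benchmark would also bypass the page cache, e.g. with O_DIRECT or by dropping caches first):

```python
# Minimal sketch of the access pattern in question: stream a file in
# large sequential chunks and report sustained read throughput.
import time

PATH = "/data/shard-00000"  # hypothetical input file on the disk under test
CHUNK = 4 * 1024 * 1024     # 4 MiB reads: bandwidth-bound, not IOPS-bound

total_bytes = 0
start = time.monotonic()
with open(PATH, "rb") as f:
    while True:
        buf = f.read(CHUNK)
        if not buf:
            break
        total_bytes += len(buf)
elapsed = time.monotonic() - start
mib = total_bytes / 2**20
print(f"read {mib:.0f} MiB sequentially at {mib / elapsed:.0f} MiB/s")
```

With reads this large, the drive spends nearly all its time transferring data rather than seeking, which is why HDDs remain viable for this class of workload.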
A clarification: our "expensive" internal SSD-based instance storage, which we call Local SSD, is not bounded by 240MB/s. The limit you mention is for SSD-based "PD" (persistent disk, our durable block storage, similar to EBS). Please refer to  for more information.
You're correct that we do not currently offer a locally attached HDD product for these types of workloads. To my knowledge, most customers are moving toward flash-based ephemeral storage for these workloads, but we'll certainly take your feedback into consideration when planning future hardware offerings.
Finally, I would encourage you to check into the "expensive" statement. An i2.4xlarge with 16 vCPU, 53GB of RAM, and 4x800GB SSD is priced at $3.41 per hour on AWS, while a similar n1-standard-16 with 60GB RAM and 8x375GB of Local SSD (about the same total capacity) is only $1.70 per hour on GCP, and that is before applying the up to 30% automatic Sustained Use Discount if you run it most of the month. At this lower price, note that SSD on GCP also offers quite a bit higher maximum IOPS than AWS. Our pricing calculator can help lay out the costs for you. It is of course true that SSD costs more than HDD in general, but that is a universal truth.
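As a rough comparison using the hourly rates quoted above (a sketch; these are the prices cited in this thread and will have changed since):

```python
# Rough monthly cost comparison using the hourly rates quoted above
# (illustrative; not current prices).
HOURS_PER_MONTH = 730

aws_i2_4xlarge = 3.41  # $/hour: 16 vCPU, 53GB RAM, 4x800GB local SSD
gcp_n1_std_16 = 1.70   # $/hour: n1-standard-16, 60GB RAM, 8x375GB Local SSD
sud = 0.30             # up to 30% Sustained Use Discount for full-month use

print(f"AWS i2.4xlarge:         ${aws_i2_4xlarge * HOURS_PER_MONTH:,.2f}/month")
print(f"GCP n1-standard-16+SSD: ${gcp_n1_std_16 * HOURS_PER_MONTH:,.2f}/month")
print(f"GCP with full SUD:      ${gcp_n1_std_16 * (1 - sud) * HOURS_PER_MONTH:,.2f}/month")
# AWS:            ~$2,489/month
# GCP:            ~$1,241/month
# GCP (full SUD):   ~$869/month
```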