Multi-bucket (Sharding) Primary Storage for NextCloud

Cloud-based object storage is one of the most cost-effective ways to give a NextCloud instance large amounts of scalable storage. Thanks to NextCloud’s support for an S3-compatible primary storage backend, this option is available to organizations of all sizes, from small businesses to enterprises.

By default, the NextCloud data directory (where user data is stored) resides on a block device-based storage volume, which can be a physical disk on a bare metal server, virtualized block volumes in the cloud (e.g. Amazon EBS), or a managed NAS based on the NFS protocol (e.g. Amazon EFS).

Each of the options above has its own advantages and drawbacks, but the option for scalable NextCloud storage we will focus on in this article is object storage. The characteristics of object storage as primary storage for NextCloud include:

  • Low cost – Because the cloud provider (e.g. Amazon, Microsoft, Google) handles all management and maintenance of the object storage, the cost of its in-house engineers is amortized across the entire customer base, which keeps the price low.
  • Scalable – Gone are the days of having a storage administrator add additional physical volumes (PV) to a Logical Volume Manager (LVM) volume group (VG) and resize the logical volumes (LV) to scale storage as your needs grow. If you are talking about warehousing a large volume of static data, the simplicity of object storage is difficult to beat.
  • Moderate performance (when using a single bucket) – The main drawback of some object storage providers is limited performance, especially when a large number of concurrent users access different resources in the same bucket. Some providers impose API rate limits, in particular per-bucket limits on API requests, which can cause temporary “503 Slow Down” upload/download errors in NextCloud. Fortunately, these limitations can be overcome by using a multi-bucket (sharding) data architecture.

A multi-bucket primary storage configuration spreads the data uploaded by your NextCloud instance’s users across multiple S3-compatible buckets, which improves performance, especially where the limiting factor was per-bucket API request limits. The number of S3 buckets to spread NextCloud data across can be configured by the admin.
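To make the idea of sharding concrete, here is a minimal sketch of how users could be assigned to buckets deterministically by hashing their user ID. The function name `bucket_for_user`, the `nextcloud-` bucket prefix, and the choice of MD5 are assumptions made for illustration; this is not presented as NextCloud's exact internal algorithm.

```python
import hashlib


def bucket_for_user(user_id: str, num_buckets: int, prefix: str = "nextcloud-") -> str:
    """Map a user ID to one of `num_buckets` bucket names, deterministically.

    Hypothetical helper for illustration; the prefix and hash choice are assumptions.
    """
    if num_buckets < 1:
        raise ValueError("num_buckets must be at least 1")
    # Hash the user ID so that users are spread roughly evenly across buckets.
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    # Take the first 4 hex digits (a number from 0 to 65535) and reduce it
    # modulo the bucket count to pick a shard.
    shard = int(digest[:4], 16) % num_buckets
    return f"{prefix}{shard}"
```

Because the mapping depends only on the user ID and the bucket count, the same user always lands in the same bucket, and requests from different users are spread over several buckets instead of hitting one per-bucket request limit. Note that under a scheme like this, changing the bucket count later would change where users map to, so the count is best chosen up front.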

Starting with NextCloud 12, the documentation of this feature is only available in the Customer Portal. To get access to the Customer Portal, you need to be a paying customer of NextCloud – which can only be achieved by licensing NextCloud Enterprise edition and taking out a support contract (starting from thousands annually).

Here’s a little-known fact. Because the open source and enterprise editions of NextCloud share the same codebase, multi-bucket primary storage continues to be supported in the latest stable releases of NextCloud Community edition (NextCloud 17.0.2, as of January 2020). We tested multi-bucket primary storage with the latest release of NextCloud, as shown below.

[Image: NextCloud and Wasabi Object Storage]

If you wish to use multi-bucket primary storage with NextCloud Community edition, contact us for a quote for NextCloud installation. There are no ongoing subscription fees (except the hosting cost) – only a one-time setup fee for us to configure and secure your server. Our server architects & engineers would be pleased to help your organization get set up.

Two Hidden Costs to Avoid When Using Object Storage with NextCloud

API Requests (PUT, COPY, POST, LIST and GET, SELECT) Cost

One of the most commonly overlooked costs of using object storage as primary storage in NextCloud is billable API requests, which some object storage services charge for. For example, Amazon charges US$0.005 per 1,000 API requests for objects stored in the S3 Standard storage class. While this may not seem like much, the cost of API requests often exceeds the US$0.02/GB/month storage cost for many NextCloud users (especially when a large number of small files are being stored).
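The comparison between request charges and storage charges can be sketched with a few lines of arithmetic. The two prices are the figures quoted above; treating every request as billed at one flat rate is a simplifying assumption, and the function names are hypothetical.

```python
# Prices quoted in the article (US$). Applying a single flat rate to all
# request types is a simplification made for this sketch.
REQUEST_PRICE_PER_1000 = 0.005
STORAGE_PRICE_PER_GB_MONTH = 0.02


def monthly_storage_cost(stored_gb: float) -> float:
    """Monthly charge for keeping `stored_gb` gigabytes in the bucket."""
    return stored_gb * STORAGE_PRICE_PER_GB_MONTH


def monthly_request_cost(requests: int) -> float:
    """Monthly charge for `requests` billable API requests."""
    return requests / 1000 * REQUEST_PRICE_PER_1000


def break_even_requests(stored_gb: float) -> float:
    """Requests per month at which request charges equal the storage charge."""
    return monthly_storage_cost(stored_gb) / REQUEST_PRICE_PER_1000 * 1000
```

With these figures, storing 100 GB costs US$2.00 per month, while 1,000,000 requests cost US$5.00. The break-even point for 100 GB is 400,000 requests per month, a volume that an instance syncing many small files can plausibly exceed.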

Egress (Out) Bandwidth Cost

Another commonly overlooked cost is the egress bandwidth charged by both the object store and the cloud provider where you host your NextCloud application server(s).

All the major cloud providers including Amazon, Microsoft Azure, and Google Cloud charge US$0.10-0.12/GB of bandwidth transferred out of their public cloud, depending on the regions where the server and the destination are located respectively. Traffic within the same region (e.g. North America, Europe, Asia, and Oceania) tends to be less expensive, while cross-region traffic, especially to Asia and Oceania (i.e. Australia and New Zealand) is more expensive.

From the perspective of the NextCloud server, when a NextCloud user retrieves files, the files are downloaded from the object store, and uploaded to the end user. Conversely (again from the perspective of the server), when a NextCloud user puts files into their cloud storage, the files are downloaded from the end user and uploaded to the object store.

It gets worse. Even higher charges apply if your object storage service and your server(s) are not hosted with the same cloud provider. For example, if you use Amazon S3 and host on Amazon EC2, only the upload leg from the server to the end user is billed. If you used Amazon S3 with a server at another provider, such as Azure, you would be billed for both the download leg from the object store (by Amazon) and the upload leg to the end user (by Azure).
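The per-leg billing described above can be sketched as a small model. The default rate of US$0.10/GB is taken from the low end of the range quoted earlier; the function names are hypothetical, and the upload model additionally assumes that inbound traffic from the end user is free and that transfers between a server and an object store at the same provider are not billed.

```python
def download_egress_cost(gb: float, same_provider: bool, rate_per_gb: float = 0.10) -> float:
    """Egress charges when end users download `gb` gigabytes.

    Same provider: only the server -> end user leg is billed.
    Different providers: the object store -> server leg is billed as well.
    """
    billed_legs = 1 if same_provider else 2
    return gb * rate_per_gb * billed_legs


def upload_egress_cost(gb: float, same_provider: bool, rate_per_gb: float = 0.10) -> float:
    """Egress charges when end users upload `gb` gigabytes.

    Assumption: inbound traffic is free, so the only billable leg is
    server -> object store, and only when it leaves the host's cloud.
    """
    return 0.0 if same_provider else gb * rate_per_gb
```

Under these assumptions, 500 GB of user downloads in a month costs US$50 when the server and object store share a provider and US$100 when they do not, and the same volume of uploads adds a further US$50 in the cross-provider case.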

If you don’t carefully plan your architecture around this, you will face a severe cost penalty on both user uploads and downloads, which can blow the costs of running your NextCloud deployment completely out of proportion. The effects of the above two overlooked costs cannot be overstated.

The Solution: Wasabi Hot Storage and alternative cloud providers

Wasabi is a company that focuses exclusively on providing an inexpensive object storage service. Wasabi Hot Storage is their answer to Amazon S3 (or Microsoft Azure Blob Storage and Google Cloud Storage), with no API request costs and a storage cost roughly one-third that of S3. Yes, you read that right. Without the hassle of less convenient storage classes such as Amazon Glacier, you can get an entire terabyte (1TB) of cloud storage for just US$5.99/month (US$0.0059/GB/month vs the industry standard of US$0.0200/GB/month). There are also no egress bandwidth costs from Wasabi, so any data transferred from Wasabi to your NextCloud server is not billed (a fair usage policy equating to the amount of your subscription applies).

For European customers concerned about data transfers outside of the EU/EEA/Switzerland due to data protection regulations such as the GDPR, Wasabi recently opened a region in the EU, dubbed eu-central-1, in Amsterdam, the Netherlands. Win-win.

The only egress charges left to cut out of the equation are the transfer from the NextCloud server to the end user (when a user downloads) and the transfer from the NextCloud server to Wasabi (when a user uploads). Fortunately, many European cloud providers, including Hetzner Cloud (Germany), OVH (France), and UpCloud (Finland), include terabytes of bandwidth with their cloud servers at no extra charge.

If you are not based in the EU (for example, if you are in the United States, Australia or New Zealand), there are other cloud hosts with practically free bandwidth (DigitalOcean, Linode, etc.) and inexpensive object store providers that we can recommend as well.

Contact us for a recommendation of the best cloud to host your NextCloud Community edition instance.