Cloud Caper

What I learned about trying to run my own cloud from a few weeks of trying to run the whole dang thing myself. (Hint: I found myself trying multiple solutions.)

Today in Tedium: About a month ago, I got this wacky idea in my head. After one too many frustrating headaches with the cloud, I decided, well, I was going to show them. As anyone who is relatively normal probably doesn’t know and anyone who isn’t has probably known for years, it is possible to run a cloud service on your computer. There is software out there that is designed to do just that. And well, you can make it your own. But, having spent time trying to do this for a little while, I’ve started to realize why normal people don’t do this. It’s really hard to get started. So let me, as an average person, try to explain this to you. How can you self-host your life, no middle-man needed? In today’s Tedium, let’s discuss the good, the bad, and the ugly of self-hosting your own cloud setup. — Ernie @ Tedium

Today’s Tedium, fittingly, is sponsored by the cloud security service Nightfall. More from them in a second.

“For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.”

— The single most infamous comment in the history of Hacker News, left on the original Show HN for Dropbox, back in 2007. No, it did not age well. Yes, there are better ways to do this. But when self-hosting your own cloud software, this is the risk of what you might sound like. (I’m pointing this out now as a way to keep myself honest.)

(Pawel Nolbert/Unsplash)

The key thing about rolling up your own cloud service is that you need to understand what your ultimate goals are

When taking on a project as complicated as running your own version of Dropbox, you need to understand what’s out there from a technical standpoint, how your own equipment can help you meet your goals, and what you might give up in the process.

Plus, you have to decide whether you’re going to run your cloud server on a local machine, or on existing cloud infrastructure.

For me, I had three major goals:

  1. Make it easy to access and edit my writing anywhere. I have years of articles, all written in Markdown, and I would like to be able to edit them on my phone, if a random inspiration strikes me. While my primary editor iA Writer is good at doing this on the Mac, I have found it far from suitable for the job on mobile, in part because of syncing issues that have made life somewhat difficult. Part of the reason I started looking for other options was because Google Drive fell down on the job of keeping my writing updated in all places.
  2. Allow me to work across multiple machines on a daily basis. On an average day, I move primarily between three different machines for different tasks: my primary machine, a MacBook Air; a 2019 MacBook Pro work machine that I try to limit my personal cloud use on (making a web interface desirable); and a 2012 MacBook Pro that has a matte screen that I like for more focused writing. I also have a number of other machines that I use less often, including a decade-old Xeon workstation I use primarily as a home server and that I generally remote into in a headless format, a couple of tablets, and a 2017 HP Spectre x360 that I use primarily as a Linux machine these days (though the Hackintosh flame still burns). I want to be able to efficiently access my stuff on all of these machines.
  3. Potentially, save a little money. Being a Google or Dropbox user is not the most expensive thing in the world, but you end up giving someone your money that you don’t necessarily have to. And honestly, there’s the trust factor to it—you have to be willing to put Google and Dropbox in a position of trust, and in the case of Dropbox, at least, I felt like the company had lost my trust.

Now, I’ll be realistic here—telling other writers to come edit documents with me on cloud storage seems like a bit of a tough sale, so I fully admit that I’m going to stick with Google for things like email and document editing-style things where I need to share with a friend. But I could invite someone to a locally hosted cloud if I so wanted.

Some other things that I considered important for me included the ability to use integrations to help automate processes like file upload, something I traditionally have used Zapier for.

Additionally, there was the debate about where to host this thing. One concern I have is, well, if I’m self-hosting, do I try to do so locally, knowing the not-unrealistic odds of a power outage knocking my stuff offline, or do I take my chances with a low-cost cloud platform like Vultr or DigitalOcean? And do I host the files locally, on a VPS, or rely on object storage from a cloud file hosting platform, like Amazon’s S3?

One interesting angle of the S3 approach: In recent years, S3-compatible cloud storage platforms have emerged as viable alternatives to Amazon Web Services, and their costs are such that it’s actually somewhat reasonable compared to Dropbox or Google Drive … if it works. Two that come to mind for this use case are Wasabi and Backblaze’s B2, which each charge less than $6 a month for a terabyte of storage. (I’d be remiss if I didn’t mention that Backblaze’s primary product is actually a good alternative to Dropbox for backing things up, if not for syncing.)

Theoretically, if I can find the right cloud storage platform and the right VPS, I could get a result that’s cost-competitive with both Google Drive and Dropbox … if it works.
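To make that comparison concrete, here is a minimal back-of-the-envelope sketch. The storage rate uses the "less than $6 a month for a terabyte" figure as an upper bound; the VPS price and the commercial plan price are hypothetical placeholders, so swap in real quotes before drawing conclusions.

```python
def monthly_self_host_cost(storage_tb, storage_rate_per_tb, vps_monthly):
    """Estimated monthly cost of object storage plus a VPS, in dollars."""
    return storage_tb * storage_rate_per_tb + vps_monthly


# Assumed figures, for illustration only.
STORAGE_RATE = 6.00      # upper bound from the text: under $6 per TB per month
VPS_MONTHLY = 5.00       # hypothetical price for a small VPS
COMMERCIAL_PLAN = 12.00  # hypothetical price for a commercial cloud plan

for tb in (1, 2):
    cost = monthly_self_host_cost(tb, STORAGE_RATE, VPS_MONTHLY)
    verdict = "cheaper" if cost < COMMERCIAL_PLAN else "not cheaper"
    print(f"{tb} TB: ${cost:.2f}/month ({verdict} than ${COMMERCIAL_PLAN:.2f})")
```

The point of the sketch is that the VPS is a fixed cost and the storage scales with use, so the answer can flip depending on how much data is involved.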

So, here’s what I learned.

What is monitoring your monitoring?

Secrets and sensitive information easily end up in logs in many different ways—find out more here. Learn what data ends up in your logs and improve your security and privacy with Nightfall: a flexible, accurate data protection solution. Let Nightfall focus on the data security so you can focus on building and scaling your core applications.

2016

The year that NextCloud, a self-hosted cloud platform produced under an open-source license, forked from ownCloud, a similar platform that is largely open-source but reserves some of its features for enterprise users under a proprietary license. NextCloud is developed by Frank Karlitschek, one of the original ownCloud developers, along with other members of that original team.

The backend of NextCloud. I’m not showing my actual backend, because reasons. (via NextCloud)

NextCloud has a nice interface, but it’s not for everyone, especially if you don’t want to spend your weekends hunting down bugs

I started this adventure with NextCloud, a service that has gotten a lot of attention from the open-source community for its ability to do a lot of very versatile things.

A fork of the enterprise-focused option ownCloud, the app is a standby for many Linux users, and even has an advantage over clients such as Dropbox, as it’s had a native Apple Silicon version for a while, and its web interface can be expanded into a full office suite. (It’s also natively supported by GNOME-based Linux distros, which is a nod in its favor.)

It has a lot of great applications available for it, including a pretty good Markdown editor that is web-based and works well in mobile settings where you’re just trying to access your content.

And as I started on this adventure, I looked at a few solutions. There is actually a fairly low-cost NextCloud host out there, Hetzner, but its servers were all hosted in Germany, meaning that they’d be taking a long trip in my use case. (But if you’re Europe-based, you may find it a good solution for you!) Other NextCloud-dedicated hosts varied significantly in price, and were generally well above the cost of a comparable commercial solution.

I liked the ability to “white label” NextCloud to my heart’s desire. Here’s what my login screen looks like.

So I landed on another approach: I tried running it on a Vultr instance, with the hopes of keeping it active full-time. I attached it to a Wasabi storage instance, which the company helpfully offers on a trial basis for testing purposes, and together I had my own personal cloud ready to go. I could even customize it to my heart’s content, adding in my own visual look, white-labeling the web interface and switching to a “dark mode” for writing.

But eventually, I ran into problems that made me question whether it was the right choice:

  1. Challenges with syncing. While the app was able to handle most files, anything over a certain size (I’d say around 200 megabytes) would fail to sync with the provided clients. Resolving this problem would require a lot of server optimization, and even with my own tinkering with the server based on NextCloud’s own docs, I found nothing I tried to be foolproof, even when running a version of the server on a local Docker instance with ample memory. (No real difference whether or not I used S3, either.) I grew concerned about whether the problem was with the sync clients or the server itself, especially after I was able to successfully upload the same large files to the Web interface.
  2. Memory consumption. To me, it felt like NextCloud really required more resources than Vultr’s lowest-end plan to be usable for long periods. And that meant a delicate balance between cost and usability. You could definitely feel it struggling to meet its modest needs even with two gigabytes of RAM. Even trying to run this with faster server technology, such as the Redis memory-caching solution on my Xeon (which has more than enough RAM to go around), still produced imperfect results.
  3. Lots of under-the-hood tweakage. Now, I like tweaking things under the hood, but I think there are definitely times where it can get a little far. And at some point, I came to the realization I was spending all weekend attempting to optimize code that was imperfect for my use case. (Does that make me bad at going under the hood? Perhaps! But I also think NextCloud kind of sells itself as a plug-and-play solution, hence my frustration.)
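Before committing to a setup like this, it can help to know how many files would actually hit that kind of size ceiling. The following is a minimal sketch, not a NextCloud tool: the 200-megabyte default mirrors the rough estimate above, and the function name is made up for illustration.

```python
from pathlib import Path


def files_over(root, limit_mb=200):
    """Return (path, size_in_mb) for files under root larger than limit_mb."""
    limit_bytes = limit_mb * 1024 * 1024
    found = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            if size > limit_bytes:
                found.append((path, size / (1024 * 1024)))
    # Largest files first, since those are the likeliest to cause trouble.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling `files_over("/path/to/folder")` on a folder you plan to sync returns the files worth testing first, or worth keeping out of the sync entirely.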

The most discouraging part, honestly, was the fact that when I searched through forums for assistance, they were often filled with people like me that seem to have run into similar roadblocks with NextCloud. It made it seem like I was far from alone here, especially when the suggested solutions seemed to have little effect.

For newbies looking to do this, I highly recommend starting slow and not trying to treat this as a full Dropbox replacement right off the drop. NextCloud is a really good self-hosted office suite selling itself as a cloud hosting tool, in my view. (I hope this feedback helps it become a little better at the cloud-hosting part, honestly. It would be nice if there was an easy way to tweak these settings within the NextCloud web interface, just as an example.)

An example of the Pydio Cells admin interface. (via Pydio)

A few other cloud-adjacent applications that I tried on my big adventure in self-hosting

NextCloud’s deficiencies, at some point, made me realize that I needed to cast my net a little wider to understand some of the technical needs I had. So I tried a number of other cloud services that work in a self-hosted format.

(Generally, my approach was to spin up local Docker instances for each of these, rather than going through an intermediary like Vultr. I either ran these through an NGINX proxy or through Caddy.)

Among them:

  1. Filerun. This had a pretty elegant interface and was well-suited for basic file storage that could be accessed from anywhere. (It also works with the NextCloud sync client, which is useful if you’re messing around with multiple services, as I was.) It’s not open-source, though, but rather a loss-leader for a proprietary version, and it doesn’t support S3 file storage, so I felt the headroom for growing with it was a bit limited.
  2. Pydio Cells. This solution wasn’t bad—it was much more stable than NextCloud, for one thing—though, like ownCloud, it locks some of its most interesting features, such as the Zapier competitor Cells Flows, away for those who rely on its proprietary commercial use cases. Despite some initial challenges with setting up the Docker container to my liking, I liked this and almost went with it; I also feel like the self-hosting community might be sleeping on this one. (That said, I’d be more likely to recommend it if it had a native Apple Silicon sync app.)
  3. Seafile. This was described to me as a file-storage-emphasizing take on NextCloud, and I think that’s a good description, though it has a few pretty decent apps. Based on C and Python, it was more stable than the PHP-based NextCloud. And while it has a proprietary version, it’s available in full to individual users like myself. So what stopped me? Well, at this time, its M1 sync app requires changes to the security settings of your machine, and honestly I have too many bad memories of recovery mode at this time. (That’s not really on Seafile so much as Apple, IMHO.)

These solutions are all good, and there are others out there as well that I didn’t try, such as the drop-in Google Drive alternative myDrive. It’s sort of like content management systems—if you want to keep digging, there’s plenty of room to do so.

But ultimately, when it came down to it, I went with something simpler. I even surprised myself.

A screenshot of Syncthing in action. I’d show you what mine looks like, but, you know, privacy. (via Syncthing)

Syncthing … just … works!

At some point, having all these headaches with all these other services, I sort of reassessed some of my thinking here.

OK, so my goal is to wean off of cloud and hosting services that I pay for, with the goal of not paying so much money for the tools that I access. Additionally, I want to be able to sync the files that I need on the fly.

At some point, I guess the question became, so why am I obsessed with needing this cloud server again?

And that led me in the general direction of Syncthing, a software tool that is designed to sync files across a number of devices.

It’s effectively a peer-to-peer network of your own devices, that doesn’t require a central server at all. Instead, you share data with different computers on your local stack in a way that keeps the overall network private.

Through its web interface, you can decide on which folders go to which machines. Generally, for me, this has meant putting my writing on every machine I use so I can access it anywhere, but putting all my large files on my Xeon. With traditional cloud services, this generally means that the files need to take a long journey. But if they’re on the same home network, why not sync them along the same home network?

I was actually kind of shocked at how well it worked; files transferred instantly. Despite being a technology that relies on a loose network of user-created front-end applications, its sync tools are impressively robust, and allow you to effectively do most of the heavy lifting around syncing without missing a beat. (One wonders if the programming language plays a role in this; NextCloud, based on PHP, crawls compared to the much snappier Syncthing, which is based on the far-newer Go.)

Honestly, the only real thing I found that gave me problems, and was really more an Android problem than a Syncthing problem, was Android’s inability to handle files with special characters in the naming convention … something that’s bound to naturally happen when you produce thousands of text files in a given year. (I batch-renamed a bunch of files, and soon enough, my problem was solved.)
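A batch rename like that could look something like the sketch below. This is not the script actually used here: the set of characters treated as unsafe is an assumption (the ones FAT-style and Android shared storage commonly reject), and the function names are made up for illustration. It builds a plan first, so nothing on disk changes until the plan has been reviewed.

```python
import re
from pathlib import Path

# Assumption: characters commonly rejected by Android/FAT-style storage.
UNSAFE = re.compile(r'[\\:*?"<>|]')


def safe_name(name, replacement="-"):
    """Replace unsafe characters in a single file name."""
    cleaned = UNSAFE.sub(replacement, name).strip()
    return cleaned or "untitled"


def plan_renames(folder):
    """Return (old_path, new_path) pairs without touching the disk."""
    plan = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            new_name = safe_name(path.name)
            if new_name != path.name:
                plan.append((path, path.with_name(new_name)))
    return plan


def apply_renames(plan):
    """Rename each file in the plan, skipping any that would overwrite another."""
    for old, new in plan:
        if new.exists():
            print(f"skipping {old}: {new} already exists")
            continue
        old.rename(new)
```

Printing the result of `plan_renames()` before calling `apply_renames()` is the safer order, especially with thousands of text files in play.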

Of the solutions I tried, Syncthing was easily the closest thing I found to a set-it-and-forget-it ideal when it came to a self-hosted solution. You honestly don’t even need a cloud with Syncthing if you don’t want one.

(Billy Huynh/Unsplash)

The solution I landed upon

So, armed with the knowledge that Syncthing is awesome but didn’t cover every one of my bases, I went with a hybrid approach. Rather than attempting to embrace one solution for everything, I decided a mix of solutions was the way to go, each optimized for specific needs.

  1. NextCloud for standard document editing and office-style applications, which can be useful in cases when I’m not near my machine or I want to make a quick edit to a file on mobile. This sync runs on just one machine, my Xeon—the same Xeon that hosts the server on Docker—and only stores essentials like text files and images at this juncture. (Essentially, I took away NextCloud’s need to sync most of my files.)
  2. Syncthing for file sync across a variety of machines. This runs on every machine I rely on, including iOS and Android.
  3. Backblaze B2 for long-term cloud file storage, which I manually handle once a week through the command-line tool Rclone. (Info here; I could easily automate this.)

At this time, this is all running on the Hackintosh, but I could move it over to a Linux installation if I so preferred over time.

Now, obviously, replacing what used to be one piece of software with three is not a fully integrated setup, but even with that limitation in mind, this is actually fairly slick, as I get most of the benefits of instant syncing while not constantly running my content through a cloud server hundreds or thousands of miles away. Really, the only files I generally need immediate access to are text files and image files; if I download an ISO file to my Downloads folder and it syncs via Syncthing, I may not want that to sync to B2. But that’s why the external sync isn’t immediate and can be manually controlled.
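The weekly Rclone run, including the "keep ISO files out of B2" rule, could be automated along these lines. Treat it as a sketch under stated assumptions: the local folder, the remote name `b2remote`, and the bucket name are all hypothetical, and it defaults to Rclone's `--dry-run` flag so the first pass only reports what would change.

```python
import shlex
import subprocess


def build_rclone_sync(source, dest, excludes=(), dry_run=True):
    """Build an `rclone sync` command as an argument list."""
    cmd = ["rclone", "sync", source, dest]
    for pattern in excludes:
        cmd += ["--exclude", pattern]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


def run_backup(cmd):
    """Run the command; needs rclone installed and the remote configured."""
    subprocess.run(cmd, check=True)


# Hypothetical local folder, remote name, and bucket.
cmd = build_rclone_sync(
    "/srv/sync/writing",
    "b2remote:my-backup-bucket/writing",
    excludes=["*.iso"],
)
print(shlex.join(cmd))  # inspect the command before running it
```

Once the dry run looks right, passing `dry_run=False` and handing the result to `run_backup()` from a weekly scheduled job would replace the manual step.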

One benefit of this strategy is that it’s fairly low-cost, with the potential to get even cheaper. Backblaze B2 has a neat feature that integrates really well with Cloudflare. Effectively, if you set up a server rule for Backblaze, you can use a custom domain for your website that will let you access files through Cloudflare’s cache, effectively avoiding egress charges that can kill the financial benefits of cloud storage of this nature. And because uploading content tends to be less expensive than downloading content with B2, the benefits of this approach are pretty strong—if you’re comfortable with the security ramifications of such an approach. (There are a couple of integrations that need to happen, but this would be a decent solve to NextCloud’s syncing issues if it added support for the above Cloudflare feature.)

I should note here that I hold nothing against Wasabi; during the month I tried it, I found it reliable and easy to use, and the costs were hard to argue with. Unfortunately, the apps I wanted to connect it to were much more flaky in nature, which made the decision to take the jump with them much harder. And since I’m not a video editor, and I’m not working with more than a terabyte or two of data I want to have accessible across my machines, it seemed like B2 would be the better choice for my use case. (And again, reminder that Backblaze offers a full service that works just like this!)

So yes, this combination is significantly cheaper than paying Dropbox or Google to store my cloud files, by a wide degree. I still have the “what if the power goes out” problem, but it’s mitigated to a degree by the fact that I’m sharing files across multiple personal machines, as well as on a storage provider as a backup. And maybe at some point I’ll decide to do this on a Raspberry Pi or something that will be low-power enough that it could run off a battery as a backup. You don’t need an old Xeon to do this; it’s just what I had lying around.

Are there ways to extend this? Yes, of course. I still would like to figure out ways to integrate my approach with Zapier or an alternative like Integromat or the self-hosted N8n. But overall, this approach has worked well for me across platforms and different tool sets.

With apologies to the weekend warriors running home labs, I think the average person is probably better off with something like Dropbox, OneDrive, or Google Drive, at least at this juncture.

But I do think that there is a natural tendency, as an end user of those services, to get too comfortable with what you already have and not be willing to shake up the routine and see what’s available elsewhere.

Your goal may be security, or it may be trying to get rid of another bill. Or maybe you just want to take a different approach. It’s not 2007 anymore; you have the options at the ready.

If you have terabytes of data stuck somewhere, that may be an uncomfortable conversation to have, but in a way, I’m kind of glad that Dropbox forced me to broach it. I don’t have to feel set in my ways anymore.

--

And thanks again to Nightfall for sponsoring.