Published Thursday May 18 2023 at 09:22.
Hacker News new | past | comments | ask | show | jobs | submit login Go and SQLite in the Cloud (golang.dk) 145 points by subomi 5 months ago | hide | past | favorite | 118 comments
markusw 5 months ago | next [–]
Hey! I'm the author of the article. Saw a sudden influx of traffic from HN and found the post here. Happy to answer any questions. Although SQLite is so dead-simple-but-awesome that you probably don't have any. :D
Anyway, hope you enjoy it and learn a thing or two.
Markus
nileshtrivedi 5 months ago | parent | next [–]
I'd like to avoid writing boilerplate CRUD code. Is there an equivalent of PostgREST but for SQLite? Essentially, a standard binary would read the schema metadata, and generate standard CRUD APIs: https://postgrest.org/en/stable/api.html The APIs can even support authentication and authorization with the help of JWT tokens. SQLite may not have row-level security, but even a convention (eg: if a row has user_id column, the JWT must have the same user_id value to get access to a row) would go a long way.
iccananea 5 months ago | root | parent | next [–]
Not exactly what you want, but I've been using https://sqlc.dev to generate hydration code from SQL queries and loving it!
LVB 5 months ago | root | parent | prev | next [–]
I've not use this myself, but Ben Johnson's https://github.com/benbjohnson/postlite in front of SQlite might allow you to use PostgREST? I recall him saying on a podcast that his goal was to be able to point the large ecosystem of PG tools at SQlite.
0cf8612b2e1e 5 months ago | root | parent | prev | next [–]
A long way off from PostgREST, but you might be able to hack together something sort of similar with Datasette [0] and the new JSON api. Not out of the box at all, but the potential is there to get a similar product. [0]: https://datasette.io/
nileshtrivedi 5 months ago | root | parent | next [–]
AFAIK, Datasette only offers read-only APIs.
0cf8612b2e1e 5 months ago | root | parent | next [–]
Not anymore [0]. The announcement [1] [0]: https://docs.datasette.io/en/latest/json_api.html#the-json-w...
[1]: https://simonwillison.net/2022/Dec/2/datasette-write-api/
mjay23 5 months ago | root | parent | prev | next [–]
Yes there is but not FOSS https://github.com/subzerocloud/showcase/tree/main/flyio-sql...
Cwizard 5 months ago | root | parent | prev | next [–]
Have you tried pocketbase?
yandrypozo 5 months ago | parent | prev | next [–]
Nice article, what is your opinion on pocketbase.io which match your title really well, it would be great a follow-up article for LiteFS and pocketbase :)
markusw 5 months ago | root | parent | next [–]
I don’t know pocketbase, will have a look.
maxpert 5 months ago | parent | prev | next [–]
Hey, I am author of Marmot (https://github.com/maxpert/marmot) would love to get in touch and explore a few things with you. I’ve been using Marmot in production in a while scaling couple of services, and since you mentioned LiteFS I would love to pick your brain, and explore couple of ideas with you. Multiple ways to get in touch including Github discussions or discord channel. Hope to see you around!
markusw 5 months ago | root | parent | next [–]
Feel free to reach out, markus@maragu.dk
marktangotango 5 months ago | parent | prev | next [–]
I suggest you run some simple load tests that do inserts, you'll find WAL is not sufficient for concurrency. Multiple reader, single writers handle this well though.
sharps_xp 5 months ago | root | parent | next [–]
You can put all your writes in a queue to be processed by a finite set of writers. I think most CRUD applications can afford the delay and if not you can pair it with a write through cache.
belmont_sup 5 months ago | root | parent | next [–]
It sounds simpler to setup postgres and never have to set up a queue (unless you’re using an in memory queue, and that has its drawbacks). Although I definitely would like to use SQLite just for the cost savings for something. Litefs/litestream looks great.
infamia 5 months ago | root | parent | prev | next [–]
I was surprised at the amount of performance SQLite provided. I get about 700 inserts/sec with the WAL enabled sitting behind a Django app. https://imgur.com/a/a3U41Zo
Without the WAL enabled, I get a around 400req/sec.
edit: clarity
thomascgalvin 5 months ago | root | parent | prev | next [–]
Concurrent writes are the bane of SQLite's existence, but for a one-person blog, you shouldn't run into any issues. You would have trouble scaling up to fives of authors, though, which would be a deal breaker for any serious production app.
markusw 5 months ago | root | parent | next [–]
@levelsio famously runs a dozen websites all backed by SQLite, with 200M reqs / month: https://twitter.com/levelsio/status/1520356430800617472 Expensify got 4 million request _per second_ out of a custom SQLite-based setup (on a huge machine, but still): https://blog.expensify.com/2018/01/08/scaling-sqlite-to-4m-q...
I would call those serious production apps.
thomascgalvin 5 months ago | root | parent | next [–]
I specifically cited concurrent writes. SQLite is fantastic for read-only data. Those websites are using SQLite for what it's designed for. But, if they were trying to use SQLite to do even dozens of writes per second, it would crash and burn.
markusw 5 months ago | root | parent | next [–]
Well, that claim is fairly easy to test. So I did: https://gist.github.com/markuswustenberg/0030360d7d4679f2656... Nothing fancy, just a small table that holds some blog posts with an ID, a title, some content, and a creation timestamp.
I ran the benchmark with and without WAL enabled on my Macbook Air (2020, M1) with some SSD drive inside. Results:
$ make benchmark go test -bench=. goos: darwin goarch: arm64 pkg: sqlite BenchmarkWriteBlogPost/write_blog_post_without_WAL-8 6441 191735 ns/op BenchmarkWriteBlogPost/write_blog_post_with_WAL-8 102559 11205 ns/op PASS ok sqlite 3.725s That's around 89k writes per second in parallel on all available cores with WAL enabled. I know this is a trivial setup, but adjust to your liking. You'll find that SQLite probably doesn't crash and burn with dozens of writes.
thomascgalvin 5 months ago | root | parent | next [–]
This is single-threaded, which a production app would not be. That's when things get scary with SQLite. Yes, I was incorrect when I said SQLite couldn't handle lots of writes quickly; what I should have said is that it can't handle lots of writes from multiple threads quickly.
markusw 5 months ago | root | parent | next [–]
Okay, fair enough. I thought my previous test was parallel as well, but I was mistaken. Here's one for 64 parallel writes: https://gist.github.com/markuswustenberg/f35ab7e191137dca5f7...
$ make benchmark go test -bench=. goos: darwin goarch: arm64 pkg: sqlite BenchmarkWriteBlogPost/write_blog_post_without_WAL-8 100 16317415 ns/op BenchmarkWriteBlogPost/write_blog_post_with_WAL-8 1196 1198679 ns/op BenchmarkWriteBlogPost/write_blog_post_with_WAL_and_Go_mutex-8 58455 17461 ns/op PASS ok sqlite 4.557s 1e9/1198679 = 834 writes per second is still far from crash and burn territory when using just WAL mode. Of course, it gets more interesting when there are also concurrent readers, as people are trying out elsewhere in the discussion. But the point still stands: it can handle way, waaaay better than "fives of authors" in a not-at-all scary way.
marktangotango 5 months ago | root | parent | next [–]
You should duplicate this test against postgres inserts. I did the same thing for sqlite; in go using a mutex on inserts, and sqlite still out performs postgres on inserts (on the same machine).
cy_hauser 5 months ago | root | parent | prev | next [–]
I'm not understanding this. Wouldn't the lower bound on performance be the single thread? Any worse performance and you queue the write requests to match the single thread performance.
thomascgalvin 5 months ago | root | parent | next [–]
No, because of SQLite's locking mechanism. It uses filesytem-level constructs to support concurrency, and this gets dicey, quickly.
markusw 5 months ago | root | parent | next [–]
As my test above shows, use a mutex for writes in Go directly and whatever locking performance problems SQLite exhibits disappear. I suspect because of the polling used in SQLite as described elsewhere on this page.
mathewvp 5 months ago | parent | prev | next [–]
I have a very basic question regarding SQLite. How does it work in different languages? Is the SQLite server code completely implemented in Go when we use go and completely implemented in php when we use php? Because there is no actual server and it is just a file, I'm wondering who does the ACID compliancy and query executions and all the other things needed for a db server and how it is done.
markusw 5 months ago | root | parent | next [–]
The implementation used in this article is actually in C (which is the official, extremely well-tested one), with bindings to the Go language through something called CGo. There's no "server", it's all just code run in your application directly, so think of it more as a library that can access a special database file on your disk. I don't know how it's done in PHP, I would think it's similar.
longcommonname 5 months ago | parent | prev | next [–]
Would you provide a summary of how you noticed the increased traffic and how you found this post?
markusw 5 months ago | root | parent | next [–]
Saw the request spike, searched HN for "sqlite".
cube2222 5 months ago | prev | next [–]
It's also a really nice combo for simple automation lambdas that need some state and you want an ergonomic DB without paying for full RDS. Go + Lambda + EFS + SQLite work great for that.
emadda 5 months ago | parent | next [–]
Do SQLite write locks work properly over EFS with many concurrent instances? https://www.sqlite.org/useovernet.html
cube2222 5 months ago | root | parent | next [–]
Yes, since 2017: https://aws.amazon.com/about-aws/whats-new/2017/03/amazon-el...
emadda 5 months ago | root | parent | next [–]
Wow, I did not expect that. And you have had no issues writing to a single SQLite file over EFS from multiple instances?
cube2222 5 months ago | root | parent | next [–]
I haven't really stress-tested it yet. My automation lambdas that use this tend to be low traffic.
lormayna 5 months ago | parent | prev | next [–]
It's not relational, but MongoDB has a generous free tier. Or maybe you can use Aurora Serverless, it's in the free tier as well (but it's designed for different use cases).
davidjfelix 5 months ago | parent | prev | next [–]
Do you limit concurrency to 1 with this setup?
cube2222 5 months ago | root | parent | next [–]
You can, though then you get errors when invoking the Lambda too much at the same time. Just letting it block on SQLite db opening should be good enough.
davidjfelix 5 months ago | root | parent | next [–]
there aren't any issues with having multiple lambdas opening the same sqlite file?
markusw 5 months ago | root | parent | next [–]
SQLite is designed for concurrent access to the same db from multiple processes. All relevant locking happens in the DB file itself.
davidjfelix 5 months ago | root | parent | next [–]
My concern is mostly around EFS consistency. SQLite concurrency locking is only as good as EFS file consistency. I wasn't sure if you had to work around any oddities of NFS. https://twitter.com/benbjohnson/status/1360592483969810443?l...
simscitizen 5 months ago | prev | next [–]
It doesn't strike me as the best language for embedding SQLite given the need to constantly cross the Cgo boundary. But I'm sure it still works fine.
Loic 5 months ago | parent | next [–]
It works very well. We are running a search engine for chemical properties in Go+SQLite and the speed is simply incredible to the point people think it is a static website. Here is a page with quite some queries to render: https://www.chemeo.com/cid/58-801-8/Pentane
markusw 5 months ago | root | parent | next [–]
Haha, I’ve heard that comment before as well. If it’s _too_ fast, it feels suspicious, like it’s not doing anything. I think we’ve collectively forgotten how fast dynamic websites can be on modern hardware!
mappu 5 months ago | parent | prev | next [–]
> A benchmark performed on go 1.15 showed 60ns of overhead for calls into C This concern is a bit overblown, the Cgo boundary is heavier than an ordinary function call but still a zillion times faster than doing disk I/O on the database.
iansinnott 5 months ago | parent | prev | next [–]
Although not mentioned in the article, there is a CGo-free port of SQLite which can be used as an alternative to the usual driver: https://pkg.go.dev/modernc.org/sqlite
Cwizard 5 months ago | root | parent | next [–]
Not relevant if you are interested in performance. This go only version is much slower than the cgo overhead. (At least it was a year ago, do your own benchmarks)
leg100 5 months ago | prev | next [–]
How does one perform deployments with go+sqlite? With a client-server database such as postgres your app and database are on separate servers, and you can perform a blue-green, canary, etc, deployment of the app, spinning up new servers running the new version alongside the servers running the old version, before shutting down the servers running the old version. But with sqlite you'd have to perform a hot-upgrade, surely? i.e. shut down the old version and quickly fire up the new version, with a small window of downtime in between.
Note: I see Litestream/LiteFS allows distributed deployment (both in beta).
markusw 5 months ago | parent | next [–]
Essentially yes, with a single SQLite instance, you have a small window of downtime on every deploy. But depending on your setup, that window could be very small, and unnoticeable if you have something in front that just delays incoming requests (like fly.io does with a load balancer in front of that single instance). And yes, LiteFS just selects a new leader and does a rolling deploy (or whatever else you want).
ngrilly 5 months ago | root | parent | next [–]
Another, more hacky option, would be to teach the binary executable how to download a new version, and start it to replace itself, while staying in the same container. Nginx does this for example. Of course, that’s some additional complexity, but then it is possible to have zero downtime upgrades (by basically upgrading the executable in the container but not the container itself). It is also possible, and simpler to upgrade the container, if it’s possible for different containers to share the same volume, but I don’t think this is possible on Fly.io.
markusw 5 months ago | root | parent | next [–]
I believe systemd can do something similar, essentially replacing the binary and queueing network calls while doing so.
ngrilly 5 months ago | root | parent | next [–]
Yes, systemd does that. But as you wrote, it will queue network calls. The nice thing with the method I suggest above is that both Go processes, the old one and the new one, can be both serving requests in parallel, as they can share the same SQLite file, which makes it really zero-downtime.
LVB 5 months ago | prev | next [–]
This is a good article that neatly covers a lot of the tips I've seen spread across various posts, talks, etc. Thanks! I'm curious what experience you or others have dealing specifically with the write concurrency elements of this setup, should you find it is actually an issue. I've occasionally seen mention of restructuring things to queue writes within the app (e.g. having a single writer goroutine fed with a channel), but I be interested in more details about when folks hit the point of needing to do that, and what they did (and did it help?)
simonw 5 months ago | parent | next [–]
If you turn on WAL mode SQLite will queue the writes for you, and since most writes complete in under 1ms you'll likely find that this isn't really worth worrying about at all. I did some trivial benchmarking around this recently: https://simonwillison.net/2022/Oct/23/datasette-gunicorn/
dinosaurdynasty 5 months ago | root | parent | next [–]
WAL mode does have some slight decrease in durability by default. If you pull the power immediately after a commit the commit may be reverted when you come back up. But yeah most of the time it isn't an issue.
markusw 5 months ago | root | parent | next [–]
This is news to me. What’s your source for that? AFAIK, SQLite is fully and completely ACID also in WAL mode.
dinosaurdynasty 5 months ago | root | parent | next [–]
https://www.sqlite.org/pragma.html#pragma_synchronous When synchronous is NORMAL (1), the SQLite database engine will still sync at the most critical moments, but less often than in FULL mode. There is a very small (though non-zero) chance that a power failure at just the wrong time could corrupt the database in journal_mode=DELETE on an older filesystem. WAL mode is safe from corruption with synchronous=NORMAL, and probably DELETE mode is safe too on modern filesystems. WAL mode is always consistent with synchronous=NORMAL, but WAL mode does lose durability. A transaction committed in WAL mode with synchronous=NORMAL might roll back following a power loss or system crash. Transactions are durable across application crashes regardless of the synchronous setting or journal mode. The synchronous=NORMAL setting is a good choice for most applications running in WAL mode.
markusw 5 months ago | root | parent | next [–]
Interesting, thanks! I didn’t know this was a configurable tradeoff. Might update the article with a little bonus info.
mappu 5 months ago | parent | prev | next [–]
Go + SQLite user at $DAYJOB here. WAL is essential, and the next essential trick is to open a single db.Conn up-front and hide that behind a mutex for any transactions/queries that will write. All callers can pick this.db_rw or this.db_ro as required.
markusw 5 months ago | root | parent | next [–]
What’s the advantage of the mutexed rw connection compared to the busy timeout and built-in locking?
mappu 5 months ago | root | parent | next [–]
The built-in one actually sleeps and polls to check if the write lock is free again. It's a disaster for high write throughput. https://sqlite.org/forum/info/00312b3d02bc0583
Sqlite's internal one has to work this way to support their multi-process guarantees. In comparison, an in-process Go mutex will be readied immediately.
LVB 5 months ago | root | parent | next [–]
Very interesting. I'm now keen to write a very simple benchmark comparing the results of N goroutines leaning on the WAL lock vs in-app serialization.
philosopher1234 5 months ago | root | parent | next [–]
would love to see the results of that, if you decide to do this
LVB 5 months ago | root | parent | next [–]
I did a small benchmark and found for high concurrency write-only loads, the results of each goroutine writing directly vs using a mutex were basically the same. But things were more interesting when I added a bunch of read goroutines to the mix too (which would be more realistic for e.g. web traffic). In that case, the mutex-protected writes were drastically reduced in favor of reads. And while I have no idea how SQlite is actually handling its locking, it makes intuitive sense that if it doesn't even know about more write load it isn't going to schedule it. I'm going to clean up the code and post some results in gist, but here is a sample:
Write-only (SQL statements/s):
direct: 45314 w/mutex: 45304
Read/Write mix:
direct: read: 85229 write: 22664 w/mutex: read: 139053 write: 1382
mappu 5 months ago | root | parent | next [–]
> I did a small benchmark and found for high concurrency write-only loads, the results of each goroutine writing directly vs using a mutex were basically the same. I'm interested to see your gist, but just to note this specific part seems to contradict https://news.ycombinator.com/item?id=33911491 , where at 64-wide parallelism for a write-only load, the Go mutex outperforms plain WAL by a factor of ~70x,
LVB 5 months ago | root | parent | next [–]
Pretty interesting results by adding SetMaxOpenConns() to Marcus' gist: https://gist.github.com/markuswustenberg/f35ab7e191137dca5f7...
LVB 5 months ago | root | parent | prev | next [–]
That revised version is indeed interesting. I hope to get things pushed tonight. My approach managed goroutines directly and didn't use the Benchmark parallelism feature. I don't see any obvious reason why that matters but structurally that's a bit different
markusw 5 months ago | root | parent | prev | next [–]
I did a small benchmark on some trivial code: https://gist.github.com/markuswustenberg/8be63c6e95dc5bb50ff... Results vary a bit on my machine, but it's pretty much the same (unless I'm doing something wrong, please point it out if so):
$ make benchmark go test -bench=. goos: darwin goarch: arm64 pkg: sqlite BenchmarkWriteBlogPost/write_blog_post_without_WAL-8 6036 186719 ns/op BenchmarkWriteBlogPost/write_blog_post_with_WAL-8 102567 11169 ns/op BenchmarkWriteBlogPost/write_blog_post_with_WAL_and_Go_mutex-8 104228 11226 ns/op PASS ok sqlite 3.833s
markusw 5 months ago | root | parent | next [–]
My mistake, this isn’t parallel. See https://news.ycombinator.com/item?id=33911491 for a parallel version with no readers.
LVB 5 months ago | root | parent | prev | next [–]
Finally got around to this: https://github.com/kalafut/go-sqlite-bench
markusw 5 months ago | root | parent | prev | next [–]
Ooooh, I didn’t know that detail. Thank you for sharing!
markusw 5 months ago | parent | prev | next [–]
Thank you for the kind words! :) The built-in busy timeout basically takes care of write queueing, so a standard setup like this will take you very, very far.
rtukpe 5 months ago | prev | next [–]
This is nice, I've used Litestream for a personal project. I wonder how it compares to something like rqlite [1] with larger datasets [1] https://github.com/rqlite/rqlite
otoolep 5 months ago | parent | next [–]
rqlite author here, happy to answer any questions. The rqlite FAQ[1] might be useful to you. [1] https://github.com/rqlite/rqlite/blob/master/DOC/FAQ.md
mlangenberg 5 months ago | prev | next [–]
I feel that I am spoiled with a great graphical user interface being available to explore MySQL databases in the form of Sequel Ace for macOS, to the point that I sometimes load an sqlite database in MySQL just to be able to browse through it with Sequel Ace. Any recommendations for a macOS GUI for sqlite that comes close to Sequel Ace? I have tried DB Browser for SQLite, but that feels a bit outdated to be honest.
markusw 5 months ago | parent | next [–]
I’m pretty happy with TablePlus. Does all the big dbs.
hapidjus 5 months ago | root | parent | next [–]
Also happy user of TablePlus. There is a free version that is limited to two tabs and some other stuff so you can try before you buy.
siliconc0w 5 months ago | prev | next [–]
Still not convinced sqlite is a good default, you can also run mysql/postgres locally which allows you similarly eliminate network hops and gives you the option to separate out things down the road. I'm not sure how Sqlite handles concurrent readers/writers these days but you'll probably at least end up scaling concurrency even in the early stages for things like async tasks.
markusw 5 months ago | parent | next [–]
It’s not just extra hops: there’s no network! (In distributed SQLite for reads at least.) Great for read-heavy workloads, and no n+1 query problems. Concurrent readers is no problem. No concurrent writes right now.
sakopov 5 months ago | prev | next [–]
I've become really interested in SQLite when I heard about Litestream. What kind of deployment strategies do folks use to deploy SQLite and Litestream for production use? Also, AFAIK writes can only be done via master instance, so how do you control read/writes from multiple app instances?
markusw 5 months ago | parent | next [–]
Have a look at the followup article that touches on distributed SQLite with LiteFS. :)
0cf8612b2e1e 5 months ago | root | parent | next [–]
Only had a chance to skim the LiteFS article, but one use case closer to my heart would be blue-green deployments against the same SQLite database for zero-downtime deployments. Conceptually, I think I could use LiteFS to the same effect as distributed, but would love to see a write-up or any gotchas likely to occur.
markusw 5 months ago | root | parent | next [–]
I don’t think there’s too much difference to “regular” distributed SQLite, but I’ll have to think about that.
fyresala 5 months ago | prev | next [–]
You won't gain much by the combination of golang, sqlite and the cloud. You can't scale out and the cloud layer only makes sqlite slower. I can't see any reason not using RDS other than it's more expensive. I would only say this is a quick way to run up an application with a cheap VPS for a beginner.
benbjohnson 5 months ago | parent | next [–]
Litestream/LiteFS author here. I agree that "cloud" is a bit ambiguous but Go & SQLite are quite powerful together and I don't think it's only for beginners. Both are fast and have low overhead. In addition to lower cost versus RDS, there's near-zero query latency which eliminates a lot of performance problems. You can comfortably run tens or hundreds of requests per second on minimal hardware (e.g. 256MB or 512MB instances). There's a lot of room for scaling up before you hit a performance ceiling.
msolberg 5 months ago | root | parent | next [–]
I just watched your gophercon video on SQLite in production after seeing it in the article. Great talk. Anyone else who's interested can watch here: https://www.youtube.com/watch?v=XcAYkriuQ1o
benbjohnson 5 months ago | root | parent | next [–]
Thanks!
ehutch79 5 months ago | root | parent | prev | next [–]
Tens of requests a second?
benbjohnson 5 months ago | root | parent | next [–]
A request can have a large range of queries within it so I was trying to account for that. If your requests are lightweight and mostly reads, you can do 1,000+ req/sec on a 256MB instance. YMMV.
randomdata 5 months ago | root | parent | prev | next [–]
He's being generous. Indeed, your service will be lucky if it sees tens of requests per day.
ehutch79 5 months ago | root | parent | next [–]
:-| Normally I'm on the side of 'do you actually get that much traffic?' but yeah, a dashboard view can generate dozens of requests alone, each with multiple queries.
tacitusarc 5 months ago | root | parent | prev | next [–]
Do you mean tens or hundreds of thousands of requests per second?
benbjohnson 5 months ago | root | parent | next [–]
There aren’t many apps that have 100,000+ req/sec and you certainly can’t run them on modest hardware. If your app averages 100 req/sec then that’s 8.4M requests per day. That’s more than most applications out there.
ketralnis 5 months ago | parent | prev | next [–]
This is a pretty strong claim without any numbers, to be honest. You were probably already running your single database instance so being limited to a single sqlite instance isn't terribly different. This does limit you to a single app server, which unless you're CPU bound is also fine. Even if you are, you can get a lot of cores in a single instance these days. Most things will never need to be scaled up
aynyc 5 months ago | root | parent | next [–]
AFAIK, sqlite3 doesn't support multi-core, so you can only do vertical scaling. In cloud, people rather do horizontal scaling. That being said, I ran a django based webapp for internal users and we saw core being pegged by sqlite3 when the queries were complex. However, under normal usage, it was fine for about 100 concurrent users including machine API calls.
cldellow 5 months ago | root | parent | next [–]
> doesn't support multi-core Can you expand on that? I thought it supported multiple readers, but only a single writer. They do all have to be on the same host, though.
Since you mentioned Django - I think there are some complexities where the Python driver can't really do concurrent access if using threads instead of processes, but this is due to Python/GIL limitations, not sqlite limitations.
ketralnis 5 months ago | root | parent | next [–]
> Python driver can't really do concurrent access if using threads instead of processes, but this is due to Python/GIL limitations Confusingly you can't run 2 lines of Python code at the same time, but you can run 2 SQL queries (sqlite, remote server, etc) at the same time. Python threads are true pthreads, they just need to hold the GIL while running Python code. They release it when running C code or doing I/O. So you can have 2 queries running at the same time no problem, but you'll only get a single core of compute processing their results.
stonemetal12 5 months ago | root | parent | prev | next [–]
It is even a little better than that these days. Beginning a transaction normally locks the whole database, but with Wal mode they have added a Begin Concurrent command so there can be multiple transactions going at once. However when you get to the commit it still requires the exclusive lock. I bet after a few years even that restriction will be gone. https://www.sqlite.org/cgi/src/doc/begin-concurrent/doc/begi...
SPBS 5 months ago | root | parent | next [–]
https://sqlite.org/forum/info/2ad28cb1e0356816e7b0bd2ab458f7... I don't think BEGIN CONCURRENT is out yet. AFAIK it's only available if you build the experimental wal2 branch yourself.
stonemetal12 5 months ago | root | parent | next [–]
I guess I miss read the branch list, I thought the begin-concurrent-3.39 branch said it had merged to trunk but that is the line above it.
aynyc 5 months ago | root | parent | prev | next [–]
sqlite3 is a single core application, so even if you have multiple cores on a box, sqlite3 isn't going to take advantage of that. If you want sqlite3 to go faster, you need better single core box. That being said, I do believe sqlite3 allow multiple processes to read the same database file. Our django app sits behind Gunicorn that will spin up multiple django instances, as long as Sqlite3 has WAL mode on, concurrent access (read) isn't a problem. For write, we basically use a lock.
ketralnis 5 months ago | root | parent | next [–]
This is confusingly worded but for onlookers: sqlite as you think of it isn't an application at all, it's a C library. There does exist a binary called sqlite3 which is an interface to that library, and that binary is (mostly, though not exclusively) single-threaded. But unless you're using that specific tool, and you're probably not unless you yourself are typing "sqlite3" into a bash shell, you're using the library rather than this query tool. And that library interacts in a similar way to a socket connect or a fopen call. You can have multiple connections open to the same sqlite database, and those connections can operate concurrently (subject to various limitations, but "you need better single core box" is not the summary)
markusw 5 months ago | root | parent | prev | next [–]
When you embed it in Go, you can have concurrent reads. Each HTTP handler request gets a separate goroutine, so you can definitely take advantage of multiple cores.
ketralnis 5 months ago | root | parent | prev | next [–]
SQLite actually scales far better than the memes would generally tell you and that "lite" implies. It's frustrating how easily these memes spread by people just repeating what they heard once. If there’s anything HN has taught me it’s to check the docs instead of believing these ambient notions It's more complicated than "doesn't support multi-core" which you can see in the other replies here: generally unlimited concurrent readers are allowed with single or at least finite writers. Depending on a bunch of settings (e.g. WAL) you might also get concurrent writers, multicore sorting, and some other things that can cross cores too. Those things do have their own tradeoffs.
But my actual claim is that most things don't need to be "scaled" at all. And if you do get there, and again statistically you won't, then you're definitely going to be doing some other rearchitecting anyway and moving sqlite->postgres might as well be part of that.
aynyc 5 months ago | root | parent | next [–]
I never said it's "lite". I can say for certain that in my experience, when writes start to go up, such as stock market event stream coming in, sqlite3 doesn't perform as well as postgres, and scaling writes is hard in sqlite3. But I have no issue scaling Postgres for those data. If you say, sqlite3 is good enough for most webapps, then I probably agree. But I would not use it for a lot of things that I use Postgres for such as pubsub, or really high transaction rate.
saila 5 months ago | root | parent | prev | next [–]
Not saying SQLite wasn't the issue here (impossible to tell based on the information provided), but I've seen a lot of webapps with highly inefficient queries because developers either didn't understand what the ORM was doing or didn't bother to optimize. In a Django app, prefetch_related can make a huge difference (or joinedload when using SQLAlchemy or better initial filtering when using raw SQL).
aynyc 5 months ago | root | parent | next [–]
We used mostly raw SQL. Django ORM is good for CRUD app, but anything analytic style app, it's more painful than it should be. I'm not blaming Django, just we picked the framework for CRUD work and had to retrofitted to do analytic because of requirements.
markusw 5 months ago | parent | prev | next [–]
See my followup-article "Distributed SQLite with LiteFS" for the scale-out part: https://www.golang.dk/articles/distributed-sqlite-with-litef... And IMO, what you lose in the cloud layer you gain by having a deployment really close to your user. Speed of light and all that.
Plus SQLite is super cool and fun! :D
infamia 5 months ago | parent | prev | next [–]
Once you need to graduate from one large server (which will take you a long way in many cases), there are tools like [rqlite](https://github.com/rqlite/rqlite) that can handle clustering. With WAL mode enabled, SQLite can handle a surprising amount of traffic that would fit a lot of use cases. If latency is important to you, it's going to be hard to beat SQLite for many workloads.
mariusor 5 months ago | parent | prev | next [–]
It's a little weird that you consider bringing up an application on bare metal something for a beginner. :) I think manually handling the nitty-gritty of an OPS setup is a more senior task than clicking you way into setting up some AWS infrastructure.
iveqy 5 months ago | parent | prev | next [–]
Depends on your application. On the opposite I've seen many cases where sqlite3 scales better than postgres, depending on the scale and the application.
If you're doing horizontal scaling, sqlite3 is a very good alternative.
0cf8612b2e1e 5 months ago | parent | prev | next [–]
> I can't see any reason not using RDS other than it's more expensive. We don’t all operate with unlimited VC money.
randomdata 5 months ago | parent | prev | next [–]
You gain low latency, which allows you to write your code to a 'pure' relational model (at least as close as SQL allows), without having to rely on code complicating hacks to deal with round-trip issues. Less complicated code can improve delivery time and reduce faults. While there are abstractions that can be used to help with those hacks, they come with their own tradeoffs. Pick your poison, as always. Conceptually, RDS is little more than SQLite with a clever networking layer built on top. In context, your application is also just a clever networking layer, so the middleman doesn't necessarily add any value. Of course it depends on exactly what you are trying to do and what tradeoffs you are willing to accept. There is no free lunch.
simonw 5 months ago | parent | prev | next [–]
At what point would you expect to need to scale out?
hackerbrother 5 months ago | prev | next [–]
FYI-- it appears your Mastodon link needs "op" changed to "io"!
markusw 5 months ago | parent | next [–]
Ooops, thanks! Deploy on the way with a fix. :D
SergeAx 5 months ago | prev [–]
I was a big fan of SQL as a language since meeting it in MS Access 95, but today I strongly believe that it became obsolete and too clumsy. I am now in favor of something more strict and structured, like MongoDB fully JSON-powered queries. Unfortunately, Mongo currently is heading directly into corporate hell, and with their uncanny license there's no way FOSS community will touch anything related with 6 foot pole in a nearby future.
Applications are open for YC Summer 2023
Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact
Search: