Discussion
Big Data on the Cheapest MacBook
TutleCpt: Oh great, the term "big data" is back.
michalc: So my definition of big data was data so big it cannot be processed on a single machine in a reasonable amount of time. I guess they’re using a different definition?
rattray: > For our first experiment, we used ClickBench, an analytical database benchmark. ClickBench has 43 queries that focus on aggregation and filtering operations. The operations run on a single wide table with 100M rows, which uses about 14 GB when serialized to Parquet and 75 GB when stored in CSV format.
very much so…
hermanzegerman: That's an awesome idea to get a bricked MacBook Neo really fast because those idiots soldered the SSD inside
bcye: I think they are simply referring to analytical workloads.
speedgoose: Computers got bigger and software got smarter. You have phones that are faster than cloud VMs of the past. You can use bare metal servers with up to 344 cores and 16TB of RAM. I used to share your definition too, but I now say that if it doesn’t open in Microsoft Excel, it’s big data.
Zambyte: Processing data that cannot be processed on a single machine is fundamentally a different problem than processing data that can be processed on a single machine. It's useful to have a term for that. As you say, single machines can scale up incredibly far. That just means 16 TB datasets no longer demand big data solutions.
montroser: This is as much an indictment of AWS compute as it is anything else.
opentokix: Mind blown, if you need to handle "big" data on the move - the macbook neo is not the right choice. - Who would have guessed that outcome?
g947o: It occurs to me that there is near zero overlap between people who use a Macbook Neo and people who run DuckDB locally. It would be a surprise if more than 0.1% of Macbook Neo users have even heard of DuckDB. Which means that this article is probably just riding the hype.
Jeffrin-dev: The disk I/O point really stands out to me. I've been working on CleanSweep, a duplicate file finder designed to handle 1M+ files, and the difference between 1.5 GB/s and 3–5 GB/s read speeds is the exact threshold where file traversal workloads go from 'acceptable' to 'frustrating'. For scanning-heavy tasks, the MacBook Neo's NVMe being slower than the Air/Pro isn't just a benchmark footnote — it compounds with file count in a non-linear way. That said, impressive that DuckDB managed SF300 with only 8GB RAM, even if query 67 taking 51 minutes is a red flag for memory-bound workloads.
post-it: Did you use AI to write a one-paragraph comment?
SteveNuts: > MacBook Neo's NVMe being slower than the Air/Pro isn't just a benchmark footnote — it compounds with file count in a non-linear way.
The “isn’t just” part is almost always a dead giveaway.
BoredPositron: Queue the endless blog posts about running tech on the potato macbook and being stunned it’s functional with massive trade-offs. Groundbreaking stuff.
Schiendelman: That usage is "Cue", not "queue".
LeifCarrotson: Cue the queue of blogs! Trigger the formation of a line of posts to be published sequentially.
rrr_oh_man: In my former life as a soulless consultant, mid-level IT managers really liked to hear the 3 "V"s mentioned: Velocity, Volume, Variety
speedgoose: The V of Value is very important in some circles.
speedgoose: I get your point, but I don’t know if big data is the right term anymore. Many people like to think they have big data, and you kinda have to agree with them if you want their money. At least in consulting. Also you could go well beyond a 16TB dataset on a single machine. You assume that the whole uncompressed dataset has to fit in memory, but many workloads don’t need that. How many people in the world have such big datasets to analyse within reasonable time? Some people say extreme data.
ody4242: I would have benchmarked with an instance that has local nvme, like c8gd.4xlarge.
ramgale: Seems completely unnecessary, there is probably 0 overlap between people who buy a cheap MacBook and people running DuckDB locally
raincole: The article is literally saying the opposite. Quote:
> Here's the thing: if you are running Big Data workloads on your laptop every day, you probably shouldn't get the MacBook Neo.
> All that said, if you run DuckDB in the cloud and primarily use your laptop as a client, this is a great device
_joel: Account was created 3 days ago, probably one of those bloody clawdbots.
lachlan_gray: Not sure about the ssd in particular but the neo is apparently pretty modular: https://www.youtube.com/watch?v=5k7Lv7f-5CQ
Robdel12: I’ve been tempted to buy one and do “real dev work” on it just to show people it’s not this handicapped little machine. I built multiple iOS apps and went through two startup acquisitions with my M1 MBA as my primary computer, as a developer. And the neo is better than the M1 MBA. I edited my 30-45 min long 4k race videos in FCP on that air just fine.
tosh: For the TPC-DS results it would also have been nice to show how the macbook neo compares to the AWS instances. Or am I missing something?
vasco: But AWS beat the laptop? And there's no cost to performance analysis? Yes AWS is overpriced but how do you make that conclusion from this specific article? Because network disks were slower than SSDs? AWS also has SSD instances with local storage.
api: Yeah, this is really about how ludicrously overpriced big cloud is. It’s staggering. Bandwidth is even worse, like 10000X markup. Yet cloud is how we do things. There’s a generation or maybe two now of developers who know nothing but cloud SaaS. I watched everyone fall for it in real time.
nicoritschel: > compared to 3–5 GB/s
Their numbers are a bit outdated. M5 Macbook pro SSDs are literally 5x this speed. It's wild.
varispeed: If you can fit it on a thumb drive, it's not Big Data.
arh5451: I agree and disagree, the benefit with cloud is you "don't need to manage it", it scales automatically, redundancy, and automatic backups etc. I do think you are right; in the future there will be more infrastructure as code as cost pressures become more obvious.
ramgine: I just retired my m1 air to being a server this month. They’re very capable laptops. If the neo is even comparable in spec it’s excellent for the price
clamlady: as a broke ecologist, this little computer can do everything I need in R and word and is a phenomenal build for the price. I'm really enjoying it thus far.
refactor_master: I think it’s relevant to first read [1] to see why they’re doing this. It’s basically done as a meme. [1] https://motherduck.com/blog/big-data-is-dead/
raegis: Can you say a little more about what you mean by "better"? How much faster is editing?
swiftcoder: Better in terms of raw specs. The original M1 Air also came with 8GB of RAM, and the A18 Pro in the Neo is faster than the version of the M1 that shipped in the base model Air
api: Those benefits are at least partly lies though. The tooling (K8S with all its YAML, Terraform, Docker, cloud CLI tools, etc.) is pretty hideously ugly and complicated. I watch people struggle to beat it into shape just like they did with sysadmin automation tools like Puppet and Chef a decade or more ago. We have not removed complexity, only moved it. The auto scaling thing is a half truth. It can do this if you deploy correctly but the zero downtime promise is only true maybe half the time. It also does this at greatly inflated cost. The uptime promise is false in my experience. Cloud goes down for cluster upgrades and myriad other reasons just as often as self managed stuff. I’ve seen serious unplanned outages with cloud too. I don’t have hard numbers but I would definitely wager that if cloud is better for uptime at all it’s not enough of an improvement to justify that gigantic markup. The only technical promise it makes good on, and it does do this well, is not losing data. They’ve clearly put more thought into that than any other aspect of the internal architecture. But there are other ways to not lose data that don’t require you to pay a 10X markup on compute and a 10000X markup on transfer. I think the real selling point of cloud is blame. When cloud goes down, it’s not your fault. You can blame the cloud provider. IT people like it, and it’s usually not their money anyway. Companies like it. They’re paying through the nose for the ability to tell the customer that the outage is Amazon’s fault.
Robdel12: Yeah! My M1 air is now my iOS build server since GH actions bill macOS mins at 10x the price.
sam345: Fantastic tear down. Thank you. Amazing for Apple. I hope this is the trend going forward but probably not. But still a gazillion screws? I just replaced the keyboard for my old hp elitebook with two screws.
mazzma: > An alternate definition of Big Data is “when the cost of keeping data around is less than the cost of figuring out what to throw away.”
That couldn't be more accurate
TacticalCoder: I'm interested by one (not for big data) but only 8 GB of RAM is kinda really sad. My good old LG Gram (from 2017? 2015? don't even remember) already had 24 GB of RAM. That was 10 years ago. A decade later I cannot see myself buying a laptop with 1/3rd the mem.
maratc: Did your LG Gram cost $450 (to make for $600 in today's money) in 2015-17? If it didn't, Apple has other laptops today with more RAM.
jsheard: I'm seeing ~6GB/sec: https://www.tomshardware.com/laptops/macbooks/m5-macbook-pro... That's decently fast but not especially remarkable, most Gen4 NVMe drives can hit those speeds. Gen5 drives are significantly faster but those are still too power hungry for laptops.
lowkj: To be clear, that article is about the base m5, not the m5 pro or m5 max. https://www.apple.com/newsroom/2026/03/apple-introduces-macb... "The new MacBook Pro delivers up to 2x faster read/write performance compared to the previous generation reaching speeds of up to 14.5GB/s..."
jsheard: OP did just say M5 (implying the base model). Those speeds on the Pro/Max are impressive though, more in line with Gen5 NVMe drives. Those have been available in desktops for some time but AFAIK they are still much too power hungry for laptops, so I think Apple is actually the first to hit those speeds on mobile.
bryanrasmussen: why does GH actions bill macOS minis 10X?
tasuki: That's not Big Data. If you "need to process Big Data on the move" - what you need is a network.
red-iron-pine: aye. the laptop is gonna have some local code, maybe a lot, but if I'm doing legitimate "big data" that data is living in the cloud somewhere, and the laptop is just my interface.
onlyrealcuzzo: This is awesome. I wish more companies would do showcases like this of what kind of load you can expect from commodity-ish hardware.
jzebedee: Mins here being short for minutes, not minis.
swiftcoder: I've used MacBook Airs as primary dev machines multiple times in my career (before Apple silicon, when Airs had truly shit performance). There is always a trade-off of cost/convenience/power, and some folks are going to end up at the Neo end of the spectrum.
windowsrookie: Apple has been soldering the SSD into MacBooks for over 10 years now, and most 10 year old MacBooks still have a working SSD.
hermanzegerman: Not if you're powerusing it like in the Article and relying heavily on Swap. Also there are countless reports of M1 8GB MacBook Airs that are bricked because the SSD used up its write cycles: https://youtu.be/0qbrLiGY4Cg?si=mjKn2oLjqAb36hPU
cestith: Not all IaC is Kubernetes.
MikeNotThePope: It’s fine if you don’t have any memory hogging apps. But as soon as you fire up a couple demanding Docker containers you’ll feel the pain. 8GB isn’t so much RAM for some applications.
pbronez: How did you get one already? I thought they were just up for pre-order
aaronharnly: That c8g.metal-48xl instance costs $7.63008 per hour on demand[1], so for the price of the laptop, you could run queries on it for about 90 hours. :shrug: as to whether that makes the laptop or the giant instance the better place to do one's work… [1] https://aws.amazon.com/ec2/pricing/on-demand/
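The break-even estimate above can be sketched as a one-line division. This is only an illustration: the hourly rate is the figure quoted in the comment, while the laptop price is an assumption (the thread never states it), and `break_even_hours` is a hypothetical helper name.

```python
# Break-even between buying the laptop and renting the large instance.
# The hourly rate is the figure quoted in the comment; the laptop price
# is an ASSUMPTION for illustration (the thread does not state it).
INSTANCE_USD_PER_HOUR = 7.63008  # c8g.metal-48xl on demand, per the comment
ASSUMED_LAPTOP_USD = 699.00      # hypothetical price, not from the source


def break_even_hours(laptop_usd: float, usd_per_hour: float) -> float:
    """Return the hours of instance time that cost as much as the laptop."""
    if usd_per_hour <= 0:
        raise ValueError("hourly rate must be positive")
    return laptop_usd / usd_per_hour


if __name__ == "__main__":
    hours = break_even_hours(ASSUMED_LAPTOP_USD, INSTANCE_USD_PER_HOUR)
    print(f"{hours:.1f} instance-hours per laptop")
```

At the assumed price this comes out a little over 90 hours, in line with the estimate in the comment. A different laptop price shifts the result proportionally.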
__mharrison__: When I teach, I use "big data" for data that won't fit in a single machine. "Small data" fits on a single machine in memory and medium data on disk. Having said that DuckDB is awesome. I recently ported a 20 year old Python app to modern Python. I made the backend swappable, polars or duckdb. Got a 40-80x speed improvement. Took 2 days.
ladberg: I'm curious - what were you doing that polars was leaving a 40-80x speedup on the table? I've been happy with its speed when held correctly, but it's certainly easy to hold it incorrectly and kill your perf if you're not careful
1a527dd5: I adore DuckDB. Did a PoC on an AWS Lambda for data that was GZ'ed in an S3 bucket. It was able to replace about 400 C# LoC with about 10 lines. Amazing little bit of kit.
alex_creates: Funny just yesterday I almost bought one but got cold feet and opted for a low range MacBook with M5 chip. The Apple sales rep was not convinced it would be enough when I described using it for vibecoding and deploying so kind of talked me out of getting the Neo. I normally use a mix of LLMs, then connect to Github and do a one-click deploy on CreateOS. Do you think I over-reacted? The price of the Neo is SO attractive, a clean half price compared to what I got.
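The small/medium/big split described in this comment can be written down as a simple size check. This is a rough sketch, not a rule from the article: `classify` and `DataSize` are hypothetical names, the 256 GB disk is an assumed figure, and real workloads also depend on compression, streaming, and working-set size.

```python
from enum import Enum


class DataSize(Enum):
    SMALL = "small"    # fits in memory on one machine
    MEDIUM = "medium"  # fits on one machine's disk, but not in memory
    BIG = "big"        # does not fit on a single machine


def classify(dataset_bytes: int, ram_bytes: int, disk_bytes: int) -> DataSize:
    """Classify a dataset against one machine's RAM and disk capacity."""
    if dataset_bytes <= ram_bytes:
        return DataSize.SMALL
    if dataset_bytes <= disk_bytes:
        return DataSize.MEDIUM
    return DataSize.BIG


GB = 10**9

if __name__ == "__main__":
    # 14 GB Parquet table and 8 GB of RAM are figures mentioned in the thread;
    # the 256 GB disk is an ASSUMPTION for illustration.
    print(classify(14 * GB, 8 * GB, 256 * GB))
```

Under this definition the benchmark table counts as medium data on an 8 GB machine, which matches the thread's point that it is not "big" in the distributed sense.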
clouedoc: I know it's not really related, but how did you manage to build two startups worth getting acquired in such a short period of time?
icedchai: With cloud, what you're really paying for is flexibility and scalability. You might not need either for your applications. At some startups, we needed it. We sized clusters wrong, needed to scale up in hours. This is something we wouldn't ever be able to do with our own hardware without tons of lead time. If your application won't ever require more resources than a single server or two, then you are better off looking at other alternatives.
butILoveLife: You could get a laptop with an Nvidia GPU, 16gb ram, 512 ssd... or a 'cheap' Macbook. I totally understand if you need to compile for iphones. We need to make apps for the lower and middle class people that think a $40/mo cellphone is a status symbol. I get it. But if you are not... why? I hate windows, but we have Fedora... and you get an Nvidia. Is it just a status symbol? And I have a hard time believing people who tell me stories about low power consumption, because no one had cared about that until Apple pretended people cared about it.
cosmic_cheese: > And I have a hard time believing people who tell me stories about low power consumption, because no one had cared about that until Apple pretended people cared about it.
That’s because battery life was pretty mediocre across the board, with Apple occasionally squeaking out a bit of an upper hand on the Air. Most laptops were in the same boat, aside from gaming and workstation laptops but battery life has never been the point of those. That changed dramatically with the M-series Macs. People didn’t start caring because Apple did, but because it meant no longer being tethered to a wall, being able to do a lot of outings without a brick or charger cable at all, and on extended trips being able to get by with a little phone charger instead of the usual huge ungainly brick. One of the primary objectives of a laptop is portability, and long life is an objective upgrade in that category. Not everybody needs it but for those who do it’s difficult to give up once you’ve had it.
butILoveLife: Apple's "reputation management"/Astroturfing companies are here.
alpaca128: Imho 8GB RAM for productivity can quickly be restrictive. I used an M1 with 8GB and my current Macbook is M2 with 16GB, and to me the difference feels bigger than 2x. It seems not everyone here feels that way, but I'd say there's a reason Apple bumped the base models to 16 and makes that exclusive to non-Neo models.
nicbou: You get a long-lasting device that's usually pleasant to use. User experience is harder to measure than specs, but at least for me, Macbooks are consistently better laptops than everything else I've used.
butILoveLife: This is why when I see a youtube video and the person is using MacOS, I skip it. ;)
butILoveLife: It's either Apple user mentality, or Apple astroturfing. Look at how they tricked people into thinking their CPU could run LLMs and how they sold integrated GPUs as something special with Unified Memory.
eru: People care about how long you can run in between charging. Low power consumption helps with that, even if you don't care about it directly.
hermanzegerman: Where do I get a Laptop with a Nvidia GPU for 600$? I'm right now in the Market for a new Laptop, because I need way more GPU Power than my T470 provides, and to be honest the MacBook Pros are quite competitively priced compared to the P-ThinkPads with Nvidia Cards (both around 3000€). They also finally offer a matte screen option. The only thing holding me back right now is the soldered SSD, RAM (and shitty Linux support). It was quite nice being able to upgrade RAM, SSD and replace the Battery on it. Otherwise it wouldn't have lasted for 9 years
anthonySs: most dev workflows from pre 2021 can probably run just fine on a NEO - i think once you get into conductor / 8 terminals with claude code territory that’s where things start to slow down. i just got an m5 max with 128gb of ram specifically to run local llms
eru: Does Claude Code take up that many local resources? I thought the heavy lifting was in the cloud?
UqWBcuFx6NV4r: What is a macOS mini…
NetMageSCW: Not “mini”, “mins” -> minutes.
butILoveLife: I need to short Apple. A $500 laptop means it's no longer a status symbol.
f6v: I never met someone using Apple laptops professionally who thought it was a status symbol. I only keep hearing this from non-Apple users.
antonyh: The only status it brings is "smart enough to not use Windows 11" or "cares enough to get the work done rather than fighting with Linux on laptops". (I use Linux on desktop as a first choice, but it's always been an uphill struggle with laptop wifi/power manglement/audio for me. I blame the esoteric chipsets used in the machines I've bought in the UK)
namibj: Do they make any promises about persistence of local NVMe after something like a full-region power outage yet? Because if you can't do durable commit on a single-region cluster that will be just temporarily unavailable without losing committed data if something like that happened, it's not quite there unless you still stream a WAL to storage that they do promise you will survive a full blackout of all zones that store (part of) the data.
LunaSea: You already lose your data after instance restart so I think that full region outage is already out of question.
ody4242: Idk how an AWS region would respond to a power outage, but i have tested this in AWS Outpost, and there, if you power down a rack, then power it back again, the baremetal instances will not be recreated. (I was surprised as I was expecting the EC2 health check to terminate them, but it does not work like that.) My understanding is that if you stop/start an instance, your local storage is gone (as the instance might even end up in a different host), but if you just reboot the instance, it should keep the local storage.
tjoff: It will do real work fine. But slack and a browser will bring it to its knees.
skybrian: Maybe if you have 100 browser tabs or something silly like that?
alpaca128: A couple YouTube tabs are enough if you leave them running for long enough. Just one YT browser process will easily take up 1-4GB sooner or later.
NetMageSCW: Or it won’t because Chrome and MacOS will know how much RAM is available and manage it effectively.
antonyh: It would have been a better fit for me than the M4 Air, I literally use it only for typing and browsing, plus a couple of Mac-only tools. Brilliant machine but complete overkill for me. It's almost tempting to switch just to get rid of the display notch.
scottlamb: > The cloud instances have network-attached disks
Props for identifying the issue immediately, but armed with that knowledge, why not redo the benchmark on a different instance type that has local storage? E.g. why not try a `c8id.2xlarge` or `c8id.4xlarge` (which bracket the `c6a.4xlarge`'s cost)?
NetMageSCW: Why do you think people buying the cheapest MacBook available will be running Docker? Do you commonly run Docker containers on the cheapest Windows laptop available? Why not?
ajross: > I’ve been tempted to buy one and do “real dev work” on it just to show people it’s not this handicapped little machine.
But... you can do the same exercise with a $350 windows thing. Everyone knows you can do "real dev work" on it, because "real dev work" isn't a performance case anymore, hasn't been for like a decade now, and anyone who says otherwise is just a snob wanting an excuse to expense a $4k designer fashion accessory. IMHO the important questions to answer are business side: will this displace sales of $350 windows machines or not, and (critically) will it displace sales of $1.3k Airs? HN always wants to talk about the technical stuff, but the technical stuff here isn't really interesting. The MacBook Neo is indeed the best laptop you can get for $6-700. But that's a weird price point in the market right now, as it underperforms the $1k "business laptops" (to avoid cannibalizing Air sales) and sits well above the "value laptop" price range.
prmph: No, you can't do real work on a $350 windows machine. No way such a setup is suitable for anything beyond browsing a tab or two and connecting to servers using SSH. And, the whole shittiness of the experience will even distract you from attempting real work: the horrible touchpad, the bad screen, the forced windows updates when you're trying to start the machine to do something urgent, ads in Windows, the lack of proper programmability of Windows (unless you use WSL).... Add the fact that the toy is likely to break in a year or two. These issues exist on far more expensive Windows machines, how much more a $350 machine. Leaving Windows machines and OS behind for more than a decade has been a continuing breath of fresh air. I have several issues with the Apple devices and macOS (as I have with Linux too), but on the whole they are far better than Windows. The only good thing about Windows that I miss on Macs is the file explorer and window management, not sure why Apple stubbornly refuses to copy those.
ajross: > No, you can't do real work on a $350 windows machine.
Sigh. I mean, even absent the obvious answers[1], that's just wrong anyway. You're being a snob. Want to run WSL? Run WSL. Want to run vscode natively? Ditto. Put it on a cheap TV and run your graphical layout and 3D modelling work. I mean, obviously it does all that stuff. OBVIOUSLY, because that stuff is all cheap and easy. All the complaining you're doing is about preference, not capability. You're being a snob. Which is hardly weird, we're all snobs about something. But snobs aren't going to buy the Neo either. Again, the business question here is whether the $350 junk users can be convinced to be snobs for $600. [1] "Put Linux on it", "All of your stuff is in the cloud anyway", "It's still a thousand times faster than the machine on which I did my best work", etc...
NetMageSCW: You mean that machine from 30 years ago that was running 30 year old software that has nothing in common with today’s development? And how well does Linux run on 4GB?
NetMageSCW: It’s necessary because the ignorant keep saying 8GB of RAM is a deal breaking limitation on the cheapest MacBook available.
hrmtst93837: Trying DuckDB on lower-end Macbooks does show you don't need much muscle for moderate-size analytics. Long term it isn't cost-effective compared to budget laptops but it's super simple for self-contained pipelines. The thing is 8GB RAM leaves you stuck once your data actually grows past the marketing demo.
NetMageSCW: Can’t give up and admit that 8GB of RAM is enough, can you?