benjojo.co.uk

benjojo posted 16 Aug 2024 16:40 +0000

Discovered that it is still possible to buy 8S systems, and honestly if I ever saw this thing in person I would be scared of it, get a load of this:

"be not afraid" etc

https://www.supermicro.com/en/products/system/mp/6u/sys-681e-tr

feuerrot@chaos.socia.. replied 16 Aug 2024 16:49 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/6B626P1lD5rq46663j

@benjojo ok, sure.
(I do really like the amount of vector graphics in SuperMicro documentation though)

benjojo replied 16 Aug 2024 16:50 +0000
in reply to: https://chaos.social/users/feuerrot/statuses/112972725705597255

@feuerrot "the design is very human"

manawyrm@chaos.socia.. replied 16 Aug 2024 16:54 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/7rT87656G15s3y31t7

@benjojo @feuerrot if the answer is 8 socket systems, the question is wrong 😹

kouett@soc.kouett.ne.. replied 16 Aug 2024 16:56 +0000
in reply to: https://chaos.social/users/manawyrm/statuses/112972745804688720

@manawyrm @benjojo @feuerrot I imagine maybe AI stuff... even then not sure this is necessary

erikk@chaos.social replied 16 Aug 2024 17:00 +0000
in reply to: https://soc.kouett.net.eu.org/objects/3ec7c548-b540-455f-a9c9-f217a7dc227a

@kouett @benjojo @feuerrot @manawyrm If you want large-scale memory footprints for in-memory databases, going this big is often the only option. @work we have a few use cases for this but could luckily solve it with just a shit ton of SSDs as swap.

These boxes are mostly a direct competitor for Power-based scale-up systems used for SAP and related large-scale databases and their uptime requirements. And having only one box to manage is often a big plus, especially in older-style enterprises.

erikk@chaos.social replied 16 Aug 2024 17:01 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972766541909694

@kouett @benjojo @feuerrot @manawyrm Now with CXL-based memory expansion on the horizon I'm not seeing a real market for this anymore tbh. With CXL or related stuff you can easily cram a bunch of memory into one system as long as it fits in the address space.

manawyrm@chaos.socia.. replied 16 Aug 2024 17:03 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972771685374454

@erikk @kouett @benjojo @feuerrot Exactly. Use CXL and/or fix your applications.

Even dual socket is already a reliability nightmare in practice (at scale), having 8 points of failure (* memory channels, etc.) — yeesh.

erikk@chaos.social replied 16 Aug 2024 17:06 +0000
in reply to: https://chaos.social/users/manawyrm/statuses/112972777640656185

@manawyrm @kouett @benjojo @feuerrot fixing your application is often not really possible or really expensive to do, or just not yet a solved problem. Often just throwing money at boxes like this is the cheapest solution, and the failure modes of these boxes are way more predictable than those of large-scale distributed systems.

Like, most large-scale distributed systems fail just as often, if you look at the failure rates most large GPU clusters are dealing with....

erikk@chaos.social replied 16 Aug 2024 17:44 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972790095585821

@manawyrm @kouett @benjojo @feuerrot oh and with CXL you run into another problem: the bloom filters (or other cache coherency structures) are often too small, so if you go to a dual-socket or other distributed design you will still see a performance impact.

penguin42@mastodon.o.. replied 16 Aug 2024 17:11 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972766541909694

@erikk @kouett @benjojo @feuerrot @manawyrm It's only SAP isn't it???

erikk@chaos.social replied 16 Aug 2024 16:42 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/6B626P1lD5rq46663j

@benjojo you can go even higher 🙃 But also whyy??? I guess there is just one really strange use case someone has for it

erikk@chaos.social replied 16 Aug 2024 16:51 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972696123454232

@benjojo https://buy.hpe.com/us/en/compute/mission-critical-x86-servers/compute-scale-up-servers/compute-scale-up-servers/hpe-compute-scale-up-server-3200/p/1014774076 those boxes go up to 32 sockets (or 16 for 4th gen). The limiting factor on the 8S is only the cache coherency and snooping, but Intel does offer the specs so you can build / design your own custom ASICs to offload this. A bit like NVLink switches.

benjojo replied 16 Aug 2024 17:01 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972730927707122

@erikk omg.

At this point the MTBF of any single CPU/DIMM is not enough to call this a machine that could possibly be reliable
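
A rough back-of-the-envelope sketch of that MTBF concern, assuming every CPU and DIMM fails independently at a constant rate and that any single failure takes the box down (no memory mirroring, no hot swap). The per-part MTBF figures and the DIMM count per socket are made-up placeholders, not numbers from this thread or any datasheet:

```python
# Series-system MTBF: with independent, constant failure rates,
# the rates add up, so the whole box fails at the sum of the part rates.

# Hypothetical per-part MTBF values in hours (placeholders only).
CPU_MTBF_H = 1_000_000.0
DIMM_MTBF_H = 2_000_000.0


def system_mtbf_hours(parts):
    """parts is a list of (count, mtbf_hours) tuples.

    Returns the MTBF of a system that goes down when any one part fails.
    """
    total_failure_rate = sum(count / mtbf for count, mtbf in parts)
    return 1.0 / total_failure_rate


def config(sockets, dimms_per_socket=16):
    """Part list for an N-socket box; dimms_per_socket is an assumption."""
    return [
        (sockets, CPU_MTBF_H),
        (sockets * dimms_per_socket, DIMM_MTBF_H),
    ]


if __name__ == "__main__":
    for sockets in (2, 8, 32):
        hours = system_mtbf_hours(config(sockets))
        print(f"{sockets:>2} sockets: ~{hours / 24 / 365:.1f} years between failures")
```

Under these assumptions the box-level MTBF shrinks in proportion to the part count, so an 8S box fails about four times as often as a 2S box built from the same parts. Redundancy features change the picture, since a single part failure then no longer counts as a system failure.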

erikk@chaos.social replied 16 Aug 2024 17:02 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/326VV3657wBGtx7wM9

@benjojo You run RAID on your DIMMs in those setups, so it's fine if one fails or trips, you can just take one offline and swap one back in. Similarly with the CPUs, those are all hot swappable and can do auto recovery.

erikk@chaos.social replied 16 Aug 2024 17:04 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972775046891768

@benjojo But compared to what IBM can do these are all rookie numbers, during Hot Chips they had a theoretical 4000 socket deployment 😨

ninkosan@infosec.exc.. replied 16 Aug 2024 17:48 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/326VV3657wBGtx7wM9

@benjojo @erikk Superdome…

jalict@mastodon.game.. replied 16 Aug 2024 20:35 +0000
in reply to: https://chaos.social/users/erikk/statuses/112972730927707122

@erikk @benjojo that thing has a shield in front of it. Be ready with your spear

kestral@hackers.town replied 16 Aug 2024 16:45 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/6B626P1lD5rq46663j

@benjojo that's a biblically accurate server right there.

penguin42@mastodon.o.. replied 16 Aug 2024 17:11 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/6B626P1lD5rq46663j

@benjojo Lenovo also do something like this still, I think as a pair of 4U(5U?) chassis that go together.

jamesog@mastodon.soc.. replied 16 Aug 2024 18:17 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/6B626P1lD5rq46663j

@benjojo Looks like a slightly beefier version of some Sun (SPARC) kit I used to run back in the day

Tenzer@s.waq.dk replied 16 Aug 2024 19:41 +0000
in reply to: https://mastodon.social/users/jamesog/statuses/112973071044264709

@jamesog @benjojo A T5440?

jamesog@mastodon.soc.. replied 16 Aug 2024 19:46 +0000
in reply to: https://s.waq.dk/users/Tenzer/statuses/112973400758883198

@Tenzer @benjojo Bigger! A pair of M4000s.

benjojo replied 16 Aug 2024 19:54 +0000
in reply to: https://mastodon.social/users/jamesog/statuses/112973420988979352

@jamesog mmmm quad socket SPARC64

jamesog@mastodon.soc.. replied 16 Aug 2024 19:59 +0000
in reply to: https://benjojo.co.uk/u/benjojo/h/WXx2vd3S9Ry7vbzNlw

@benjojo Those machines were pretty fun. The OOB was probably the best I've used. IIRC it was based on Linux/ARM. And a bit like a mainframe you could do either hardware partitioning or use its own hypervisor for logical partitioning.