More

CJefferson · 2026-04-28T07:43:05 1777362185

So, this sounds exciting to me, but the postcode checker really feels like a spam as a user. All it tells me is 'Mixed results'. I could make a website that prints 'mixed results', I bet most results are 'mixed'!

I understand wanting to get money, but honestly, there is no way I would give money to this website in it's current state, you are giving me far too little info before asking me to hand over a credit card.

Then, if someone gives you £19, a crazy amount of money honestly, the last page of the report is an advert to give them 4 times more!

mebkorea · 2026-04-28T07:47:17 1777362437

Really useful feedback, cheers. Yeah, "Mixed results" is kinda rubbish as you say. It should give you something concrete before asking for anything. I'll fix that today. Fair point on the £79 upsell at the end of a £19 report too. That's tone deaf and I'll move it. On the £19... I'll think about it, but you're right the site needs to do more to justify the spend before pulling out a card. Appreciate the honest take!

CJefferson · 2026-04-28T07:53:09 1777362789

Just a quick follow up, if my reply seemed very harsh, view that as a sign of how enthusiastic I was to see the website at first. I understand wanting to make money, but I'd seriously consider giving a lot more away (maybe even the basic report stuff) away for free, I'd love to explore my local area, my parent's, be nosey what life is like in Oxford (a place I previously lived), but even if I was willing to pay (I'm not), having to stop, get PDF, download, really breaks the flow.

mebkorea · 2026-04-28T07:58:44 1777363124

No, that's absolutely a fair follow-up and not harsh at all. It's very helpful. The "be nosey about places you used to live" use case is exactly what the postcode tool should serve (thinking about it), and right now it doesn't. You're right that PDF-downloads break flow badly. Tbh... that's a hangover from the "people want a thing they can save" assumption that I'm still stuck in, I guess. I'm still on the fence about giving the paid reports away wholesale, but the gap between "tells you nothing" and "£19 PDF" is way too big. I'm gonna need a middle layer of free but actually useful exploration on the site. Will have a solid think about this today. Appreciate the feedback!

gnfargbl · 2026-04-28T08:02:36 1777363356

I'm also enthusiastic, it's not often you see people find a genuinely underserved niche and you have.

I don't know if I would pay £19 for a general state-of-the-area report. I would almost certainly have paid £100-300 for a service that took my planning application, critically reviewed it and told me which aspects were and were not likely to pass, with references to specific examples within my local area.

mebkorea · 2026-04-28T08:12:16 1777363936

Thanks, honestly that means a lot! Yeah, the pre-submission review idea is interesting and I've thought about it. I have the data to surface "applications similar to yours in your ward, here's what got approved and what didn't" but I haven't built it as a workflow because it requires the user to upload their plans... and that's a different kind of trust ask, but yeah, it is definitely worth revisiting. £100-500 is also a much more honest price for something that genuinely changes a decision. £19 is in the awkward "too much for curiosity, too little for stakes" zone you and the other commenter are both pointing at.

ramon156 · 2026-04-28T08:28:08 1777364888

Just checking, are you using an LLM to reply? Your replies are riddled with things LLMs are good at, like making quoted analogies that make no sense. They're not even analogies

pjc50 · 2026-04-28T08:15:52 1777364152

What benefit would people gain from the reports? Average rate of success/time is interesting, but I'm not sure what you'd do with this information other than a bit of local press discourse. I suppose it's nicely timed for the council elections?

mebkorea · 2026-04-28T08:29:23 1777364963

Honest answer... I don't fully know, zero paying customers so it's still very much a hypothesis. The two use cases I think hold up: (1) people pre-buying a house with extension potential, who otherwise guess or pay £500+ for a planning consultant; (2) homeowners about to commission £2-5k of architect drawings who want a sanity check before proceeding. Someone else suggested £100-500 for a proper pre-submission review which is probably better for that second case than my £19 report. The "general state-of-area" framing is the weakest one and you're right it's mostly local press discourse — that's marketing not revenue.

CJefferson · 2026-04-28T02:42:40 1777344160

They made a new format with basically no accessibility. We finally got latex usable by blind people with acceptable html output, I’m not moving to something worse.

xigoi · 2026-04-28T09:56:10 1777370170

In what ways is HTML produced by transpiling LaTeX more accessible that HTML produced by Typst?

CJefferson · 2026-04-28T10:12:00 1777371120

The HTML generated by LaTeX is currently very good (you can read basically every paper on arXiV in HTML). The html generated by Typst just.. doesn't really work currently. I checked a few weeks ago, tables didn't output sensibly. Looking at the docs their plan doesn't seem to be to aim for total coverage in general.

leephillips · 2026-04-28T14:07:45 1777385265

By “The HTML generated by LaTeX” do you mean by latexml (the tool used, I think, by arxiv) or something else?

CJefferson · 2026-04-26T06:56:57 1777186617

This is a really nice website!

In China it turns out there are lots of rule sets. The city I'm currently living in (Changsha) has it's own ruleset for example, with less tiles than these examples.

fortedoesnthack · 2026-04-26T10:04:02 1777197842

mahjong rulesets are wild. I play Japanese mahjong, and the difference between online and a mahjong parlor is quite different, making it interesting to see what people optimize for in those different settings

I think mahjong is probably "house rules the game" though. Pretty sure most mahjong hands probably just were a result of some guy being like "hey this hand looks like it should be scored man".

t-3 · 2026-04-26T10:45:30 1777200330

It's similar to dominos then - every region and cultural/ethnic group has their own variant, and every family has their own house rules. Or craps! I was so confused my first time playing in a casino after learning to play in the streets.

gs17 · 2026-04-26T18:59:01 1777229941

Yeah, the grad students near me always play Sichuan-style, because it's simpler (only numbered tiles, scoring is a bit easier).

djyde · 2026-04-26T14:21:59 1777213319

In my city, Guangdong, Cantonese mahjong doesn't allow chows, only pongs, and you need to self-draw to win.

CJefferson · 2026-04-24T05:12:02 1777007522

What's the current best framework to have a 'claude code' like experience with Deepseek (or in general, an open-source model), if I wanted to play?

deaux · 2026-04-24T05:45:27 1777009527

https://pi.dev/

TranquilMarmot · 2026-04-24T05:35:23 1777008923

https://opencode.ai/

whoopdeepoo · 2026-04-24T05:15:48 1777007748

You can use deepseek with Claude code

esperent · 2026-04-24T06:31:08 1777012268

You can, but does it work well? I assume CC has all kinds of Claude specific prompts in it, wouldn't you be better with a harness designed to be model agnostic like pi.dev or OpenCode?

rane · 2026-04-24T07:31:26 1777015886

I've been using all Kimi K2.6, gpt-5.4 and now Deepseek v4 (thought not extensively yet) in Claude Code and I can say it works much better than you'd expect. It looks like the system prompt and tools are pulling a lot of weight. Maybe the current models are good enough that you don't need them to be trained for a specific harness.

Alifatisk · 2026-04-24T06:30:49 1777012249

You can use CC with other models, you aren’t forced to use Claude model.

0x142857 · 2026-04-24T05:19:25 1777007965

claude-code-cli/opencode/codex

CJefferson · 2026-04-24T05:10:44 1777007444

To me, the important thing isn't that I can run it, it's that I can pay someone else to run it. I'm finding Opus 4.7 seems to be weirdly broken compared to 4.6, it just doesn't understand my code, breaks it whenever I ask it to do anything.

Now, at the moment, i can still use 4.6 but eventually Anthropic are going to remove it, and when it's gone it will be gone forever. I'm planning on trying Deepseek v4, because even if it's not quite as good, I know that it will be available forever, I'll always be able to find someone to run it.

muyuu · 2026-04-24T11:20:42 1777029642

Yep, it's wild how little emphasis is there on control and replicability in these posts.

Already these models are useful for a myriad of use cases. It's really not that important if a model can 1-shot a particular problem or draw a cuter pelican on a bike. Past a degree of quality, process and reliability are so much more important for anything other than complete hands-off usage, which in business it's not something you're really going to do.

The fact that my tool may be gone tomorrow, and this actually has happened before, with no guarantees of a proper substitute... that's a lot more of a concern than a point extra in some benchmark.

CJefferson · 2026-04-20T02:30:39 1776652239

So, I've just read a few dozen student reports, which I'm 95% sure were mostly generated by AI.

The problem isn't one page of one report. It's not even one whole report. But, the more you read, the more irritating it gets. It's hard not to notice the AIisms, and once you know them, it gets really obvious. And I know some people will say 'Oh, I say X', for any particular X, but the thing that people don't do is use some same construction at least twice a page, every page, forever.

Now, I can imagine there ends up being a bit of a battle, where AIs try to learn to write 'less AI', but for now, it's very obvious if you read enough AI generated stuff.

gruez · 2026-04-20T02:42:20 1776652940

>It's hard not to notice the AIisms, and once you know them, it gets really obvious.

Maybe I haven't read enough uber eats descriptions to notice, but at least from the sampling above it doesn't seem too obviously AI. There might be a lot of cliche wording, but it's not even clear whether it's worse than human reviews/descriptions.

CJefferson · 2026-04-17T01:03:31 1776387811

I'm a professor at uni, and this is what is happening -- many students are never really learning. Then they crash into exams at the end of term when they don't have their AI, and they bomb, I'm seeing failure rates like never before.

Now, part of me thinks 'is not letting students having AI like not letting them have a calculator'. On the other hand, if I just let the AI do the exam, well I don't really need the student at all do I?

koonsolo · 2026-04-17T11:36:14 1776425774

When kids learn calculation, they indeed are not allowed to use a calculator.

Same is true for your field now. When kids learn things the AI already knows, it's clear they can't use the AI.

If you want them to become smarter than the AI, they will have to pass through a period where they are dumber than the AI, and it's clear at that point they can't use it.

AI raised the bar, that's all. But it's still a bar that can be passed with human intelligence, and your job is to get them past that.

rogerrogerr · 2026-04-18T00:47:45 1776473265

> it's still a bar that can be passed with human intelligence

Can you expand on this?

koonsolo · 2026-04-18T09:40:35 1776505235

As a developer becomes better, they become better than an LLM, being able to deal with more complex things than what an LLM can handle. Some people will not be able to pass it, but others will.

When there will ever be AGI (I don't think this can be achieved with the current architecture, it needs another AI breakthrough), then we might not be able to surpass it, much like chess currently.

CJefferson · 2026-04-16T06:39:41 1776321581

Oh, that is cool, I’ve never seen that. I might add that to an extended version of the post sometime, I’ll be sure to credit you.

CJefferson · 2026-04-14T02:11:34 1776132694

The chances of significant bugs in lean which lead to false answers to real problems are extremely small (this bug still just caused a crash, but is still bad). Many, many people try very hard to break Lean, and think about how proofs work, and fail. Is it foolproof? No. It might have flaws, it might be logic itself is inconsistent.

I often think of the ‘news level’ of a bug. A bug in most code wouldn’t be news. A bug which caused lean to claim a real proof someone cared about was true, when it wasn’t, would in the proof community the biggest news in a decade.

CJefferson · 2026-04-12T23:23:33 1776036213

Yes, I've found tests are the one thing I need to write. I then also need to be sure to keep 'git diff'ing the tests, to make sure claude doesn't decide to 'fix' the tests when it's code doesn't work.

When I am rigourous about the tests, Claude has done an amazing job implementing some tricky algorithms from some difficult academic papers, saving me time overall, but it does require more babysitting than I would like.

Tuna-Fish · 2026-04-12T23:52:44 1776037964

Give claude a separate user, make the tests not writable for it. Generally you should limit claude to only have write access to the specific things it needs to edit, this will save you tokens because it will fail faster when it goes off the rails.

LelouBil · 2026-04-13T01:20:42 1776043242

Don't even need a separate user if you're on linux (or wsl), just use the sandbox feature, you can specify allowed directories for read and/or write.

The sandbox is powered by bubblewrap (used by Flatpaks) so I trust it.

eru · 2026-04-13T09:51:01 1776073861

You might want to look into property based testing, eg python-hypothesis, if you use that language. It's great, and even finds minimal counter-examples.