Hacker News | ericchiang's comments

This is a very small amount of boilerplate around the golang.org/x/net/html package. If you need the huge feature set of goquery, use that, but I find this pretty suitable for my day-to-day problems.


  rows := scrape.FindAll(table, scrape.ByTag(atom.Tr))
  cols := []*html.Node{}
  for _, row := range rows {
      // Find returns the first result
      col, ok := scrape.Find(row, scrape.ByTag(atom.Td))
      if ok {
          cols = append(cols, col)
      }
  }


Thanks!


> You can whip up a REST service very easily that wraps an sk-learn predictor and I would bet it's actually much easier to do than writing PMML exporters.

So as it turns out I spend my days building the very product you're describing (yhathq.com; a REST API-ifier for R and Python). The scikit-learn community alone are a wonderful group who do a hell of a job. It's kinda crazy that most products won't let you use that awesomeness and instead choose to build out their own machine learning libraries to work within their system.

This article got passed around the office this morning and it seems to encompass the general theme of most ML tools. They empower you to do cool things with machine learning/general data analysis, but at the expense of being able to use the libraries that most people use to do machine learning/general data analysis. Don't know if I'd consider that poor design, but yeah, it's definitely a tradeoff.

Hmm, maybe I should be reaching out to Airbnb's data science team?


Probably the best explanation of the intuition behind Benford's law. Worth a watch if you've got the time:

https://www.youtube.com/watch?v=XXjlR2OK1kM
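
For anyone who'd rather poke at it than watch: Benford's law says a leading digit d shows up with probability log10(1 + 1/d), so about 30.1% of the time for a 1. Here's a rough sketch that compares that against the leading digits of the first 1000 powers of 2. The powers-of-2 sequence and the function names are my own picks for illustration, not something taken from the video.

```go
package main

import (
	"fmt"
	"math"
	"math/big"
)

// benfordProb returns the probability Benford's law assigns to
// leading digit d, for d in 1..9.
func benfordProb(d int) float64 {
	return math.Log10(1 + 1/float64(d))
}

// leadingDigitCounts tallies the leading decimal digit of 2^1 .. 2^n.
// Index 0 of the result is unused. (Illustrative helper; powers of 2
// are just a convenient sequence to test against.)
func leadingDigitCounts(n int) [10]int {
	var counts [10]int
	p := big.NewInt(1)
	two := big.NewInt(2)
	for i := 0; i < n; i++ {
		p.Mul(p, two)
		// The first character of the decimal string is the leading digit.
		counts[p.String()[0]-'0']++
	}
	return counts
}

func main() {
	const n = 1000
	counts := leadingDigitCounts(n)
	fmt.Println("digit  observed  benford")
	for d := 1; d <= 9; d++ {
		fmt.Printf("%5d  %8.3f  %7.3f\n", d, float64(counts[d])/n, benfordProb(d))
	}
}
```

The two columns should land close to each other, which is the part that feels surprising until you see the explanation.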


Hi there.

That ML problem is there more as an example than as a rigorous analysis. In fact, that particular problem would probably be better suited to other algorithms (e.g., random forest).

My background's in biomedical imaging, so I'm quite fond of problems with skewed class distributions, though I didn't have time to explore this particular one further.

The code's all openly available if you want to give it a go though :)

