
maslam · on Nov 3, 2022

No connection to the Akiflow team. I found the tool on my own and really like it. It’s snappy and does what it claims to do. I don’t really need their Calendly-lite implementation but everything else is very well put together.

sfavaro · on Nov 3, 2022

Thank you very much, Maslam! We really appreciate your support! If there is anything we can do to improve Akiflow for you, let us know.

maslam · on Nov 3, 2022

Well written and kind response. Thank you for doing this. We could use a bit more of this on HN.

mythhouse · on Nov 3, 2022

> The 'high' or 'dominant' castes make up more than 90% of Indian migrants as per a study in 2016.

> The backward castes of South Asia, known as Dalits, form 1.5% of all Indian immigrants to the United States

Calling me ignorant and a passive aggressive "forgive you" without actually addressing my comment isn't "kind".

GP came here with a stem degree worth millions of dollars in lifetime earnings, NOT $30 in pocket (which was used as proof of being "not privileged").

Grossly misrepresenting oneself and denying privilege is not kind.

complexmango · on Nov 3, 2022

Astounding to see racism and ignorance represented as virtue signaling so blatantly!

Are you aware that the majority of STEM seats in Indian colleges are set aside for either affirmative action for ‘lower’ castes (itself an arbitrary criteria) and women?

The average man in India is competing for approx 10% of college seats!

And saying an Indian STEM degree is worth millions in lifetime earnings reveals even more ignorance - outside of a handful of colleges, the Indian higher education system is considered woefully inadequate and underfunded, pumping out graduates who only succeed out of personal initiative.

Every major corporation in India essentially has a retraining program to actually teach what’s necessary for work. The worker training issue is one of the top challenges faced by businesses there.

Perhaps it’s time you described your own ethnicity and background etc. Let’s learn a little more about where your biases flow from.

If you actually grew up in India in the time period of the OP, or knew anything about it other than nonsense in woke publications, you would feel ashamed for your tone and envy and render a sincere apology to the OP.

random314 · on Nov 3, 2022

> Are you aware that the majority of STEM seats in Indian colleges are set aside for either affirmative action for ‘lower’ castes (itself an arbitrary criteria) and women?

And these seats are never filled and go to upper castes anyway?

My class was supposed to be 50% low caste with 25% Dalits in a class of 60. There were literally 2 Dalits in my class. And of late, the OBC categories are being crowded in by upper castes through political action.

A whole 50% of my class was actually "payment seats" where they paid double the fees to get seats with low entrance scores. Basically, a 50% quota for the privileged.

garfieldnate · on Nov 3, 2022

Curious non-Indian here, how did you know that there were exactly two Dalits in your class? Is this a thing people talk about? Were they open about it? Or do you just recognize them on sight?

random314 · on Nov 4, 2022

I am not a Hindu, so my caste radar is fairly weak. I can't tell caste by last name or by religious practices.

Having said that, people ask folks how much they scored in their entrance exam. Then it becomes obvious who the Dalits are - they had the lowest scores. The 2 of them were also fairly underprivileged, so you could guess from their less trendy and not as new clothes too.

50% were payment seats with about the same entrance rank as OBC students. So it was hard to tell the OBCs apart from the payment seat students. India actually has "affirmative action" for the rich in all private universities, which most Indian H1Bs have graduated from. My parents couldn't afford these seats. The H1B upper castes won't tell you that, and will pretend that rich Dalits are taking over all the seats. No such thing is happening. I have never met a rich Dalit in my life. I am sure they exist, but I have never met them.

Their academic performance improves with time, and some of them are engineering managers at Microsoft now.

garfieldnate · on Nov 4, 2022

Okay, but to follow up, how do you know that some are engineering managers at Microsoft, now? Is this just something that gets around via gossip?

Also, is "caste radar" a word you made up, or is that a generally-used concept?

random314 · on Nov 4, 2022

> how do you know that some are engineering managers at Microsoft, now?

I am connected to those folks on LinkedIn.

> Also, is "caste radar" a word you made up, or is that a generally-used concept?

I made it up. But I have read content online, where Dalits feel a huge pressure when they are asked which temple they attend, or how they celebrate certain festivals. I don't know what these ritual differences are. Many folks pick up caste from last names, I can only pick up a small subset of them. What I meant to say was that - I can't tell the caste of a person as easily as Hindus. The Dalits I know, I surmised their caste from their entrance rank and many of them didn't hide it.

mythhouse · on Nov 3, 2022

I merely quoted some stats; stats are not racist. I read through your rant but didn't see an explanation of why only upper caste Indians are emigrating to the USA. What's that about? How do only 30% of Indians make up 90+% of Indians in the USA?

hardware2win · on Nov 3, 2022

>GP came here with a stem degree worth millions of dollars

Degree helps you at the beginning of the career

After you have exp and knowledge

It becomes less and less relevant

It is impossible to predict value of degree

Just like the value of connections

mythhouse · on Nov 3, 2022

> It is impossible to predict value of degree

Surely a stem degree and an opportunity to get a post graduate stem degree in a blazing hot tech market is worth more than $30.

Do you really think GP and a Venezuelan asylee have the same amount of privilege because they both came over with $30 in their pockets? Yet that was the logic GP used to justify his lack of privilege.

hardware2win · on Nov 3, 2022

All I'm saying is that you are probably giving the degree too much value

You guys over there seem to be treating a degree like the only way to a better life

felix_n · on Nov 3, 2022

Read it over again Magoo: the subject is a specific kind of degree and how it opens doors for a specific industry in the Bay Area. A specific industry known for being insanely competitive and requiring almost comical qualifications to get in. Put the pieces together.

hardware2win · on Nov 3, 2022

I still stand with my point

Degree is not worth this much.

tstrimple · on Nov 3, 2022

Your opinion is contradicted by reality.

https://www.forbes.com/sites/michaeltnietzel/2021/10/11/new-...

hardware2win · on Nov 3, 2022

Correlation =/= causation

People with degrees, especially in STEM, are a small group (compared to all people) who are willing to put multi-year, intense effort into studying non-trivial things

So yea, just those traits make them more likely to earn more

Go ahead and give a degree to somebody without the skills for good paying jobs and see whether that person will be millions ahead too

krageon · on Nov 3, 2022

A degree gets you in the door at all and starts you off with a higher salary (I would say significantly so) than someone with no comparable education. Those two things are a huge advantage, both in terms of money and in terms of the odds that you will be able to start making it at all.

hardware2win · on Nov 3, 2022

That's what I wrote - a degree makes your start easier, but with years of exp. it becomes less and less relevant

So the biggest salary gap is at the beginning and later everything is up to the skill

If you want fancy jobs like Jane Street then "just" degree from random school may not be enough

maslam · on May 24, 2022

Databricks PM here.

Backfills are on our roadmap. We are previewing looking up failed jobs soon.

Email me at bilal dot aslam at Databricks dot com if you want more info

maslam · on Nov 26, 2021

Which one did you get?

maslam · on Nov 15, 2021

Everyone wins when data platforms submit audited benchmarks...

maslam · on Nov 13, 2021

@AtlasLion you are right, real world performance matters. We test extensively with actual workloads, and the speed up holds there too. For example: lots of real world BI queries are repeated over smallish data sets of 10 to 50 GB. We test that size factor and pattern all the time.

maslam · on Nov 13, 2021

Databricks broke the record by 2x and is 10x more cost effective, in an audited benchmark. Snowflake should participate in the official, audited benchmark. Customers win when businesses are open and transparent…

mst · on Nov 13, 2021

Databricks and snowflake should pay an independent third party to re-run these. In-house benchmarks by either company don't count with results this different.

cmhill · on Nov 13, 2021

Databricks didn't run the Snowflake comparison in-house. From their article it says: "These results were corroborated by research from Barcelona Supercomputing Center, which frequently runs TPC-DS on popular data warehouses. Their latest research benchmarked Databricks and Snowflake, and found that Databricks was 2.7x faster and 12x better in terms of price performance."

dekhn · on Nov 13, 2021

I don't trust a supercomputer center to do a good job running a TPC benchmark (I do trust them to run LINPACK benchmarks).

jiggawatts · on Nov 13, 2021

Audited how? If you look at the Snowflake response the numbers being posted by Databricks look outright faked or otherwise false.

rxin · on Nov 13, 2021

There's an official TPC process to audit and review the benchmark process. This debate can be most easily settled by everybody participating in the official benchmark, like we (Databricks) did.

The official review process is significantly more complicated than just offering a static dataset that's been highly optimized for answering the exact set of queries. It includes data loading, data maintenance (insert and delete data), sequential query test, and concurrent query test.

You can see the description of the official process in this 141 page document: http://tpc.org/tpc_documents_current_versions/pdf/tpc-ds_v3....

Consider the following analogy: Professional athletes compete in the Olympics, and there are official judges and a lot of stringent rules and checks to ensure fairness. That's the real arena. That's what we (Databricks) have done with the official TPC-DS world record. For example, in data warehouse systems, data loading, ordering and updates can affect performance substantially, so it’s most useful to compare both systems on the official benchmark.

But what’s really interesting to me is that even the Snowflake self-reported numbers ($267) are still more expensive than the Databricks’ numbers ($143 on spot, and $242 on demand). This is despite Databricks cost being calculated on our enterprise tier, while Snowflake used their cheapest tier without any enterprise features (e.g. disaster recovery).

Edit: added link to audit process doc

_dark_matter_ · on Nov 13, 2021

Thanks for the additional context here. As someone who works for a company that pays for both databricks and snowflake, I will say that these results don't surprise me.

Spark has always been infinitely configurable, in my experience. There are probably tens of thousands of possible configurations; everything from Java heap size to parquet block size.

Snowflake is the opposite: you can't even specify partitions! There is only clustering.

For a business, running snowflake is easy because engineers don't have to babysit it, and we like it because now we're free to work on more interesting problems. Everybody wins.

Unless those problems are DB optimization. Then snowflake can actually get in your way.

rxin · on Nov 13, 2021

Totally. Simplicity is critical. That’s why we built Databricks SQL not based on Spark.

As a matter of fact, we took the extreme approach of not allowing customers (or ourselves) to set any of the known knobs. We want to force ourselves to build the best system, one that runs well out of the box and yet still beats data warehouses in price perf. The official result involved no tuning. It was partitioned by date, loaded data in, provisioned a Databricks SQL endpoint and that’s it. No additional knobs or settings. (As a matter of fact, Snowflake's own sample TPC-DS dataset has more tuning than the one we did. They clustered by multiple columns specifically to optimize for the exact set of queries.)

geoduck14 · on Nov 13, 2021

>That’s why we built Databricks SQL not based on Spark.

Wait... really? The sales folks I've been talking to didn't mention this. I assumed that when I ran SQL inside my Python, it was decomposed into Spark SQL with weird join problems (and other nuances I'm not fully familiar with).

Not that THAT would have changed my mind. But it would have changed the calculus of "who uses this tool at my company" and "who do I get on board with this thing"

Edit: To add, I've been a customer of Snowflake for years. I've been evaluating Databricks for 2 months, and put the POC on hold.

alexott · on Nov 13, 2021

it's different - rxin talks about this: https://databricks.com/product/databricks-sql

when you run Python, it's on Spark, although you can now use the Photon engine that is used for DB SQL by default

gibneyMI · on Nov 15, 2021

Credit to you for these amazing benchmark scores via an official process. You've certainly proved to naysayers such as Stonebraker that lakes and warehouses can be combined in a performant manner!

Shame on you for quoting a fake non-official score for Snowflake in your blog post with crude suggestions to make it seem you're showing an apples-to-apples comparison.

I run a BI org in an F500 company that uses both Databricks & Snowflake on AWS. I can tell you that such dishonest shenanigans take away much from your truly noteworthy technical achievements and make me not want to buy your stuff for lack of integrity. Not very long ago, Azure+GigaOM did a similar blog post with fake numbers on AWS Redshift and it resulted in my department and a bunch of large F500 enterprises that I know moving away from Synapse for lack of integrity.

On many occasions, I've felt that Databricks product management and sales teams lack integrity (especially the folks from Uber & VMW) and such moves only amplify this impression. Your sales guys use arm-twisting tactics to meet quotas and your PM execs are clueless about your technology and industry. My suggestion is to overhaul some of these teams and cull the rot - it is taking away from the great work your engineers and Berkeley research teams are doing.

uvdn7 · on Nov 13, 2021

Snowflake claims the Snowflake result from Databricks was not audited. It’s not that Databricks' numbers were artificially good but rather that Snowflake’s number was unreasonably bad.

jiggawatts · on Nov 13, 2021

Please also refer to my comment below on the value of the TPC audit process: https://news.ycombinator.com/item?id=29208172

maslam · on Nov 13, 2021

Hey jiggawatts - TPC is the official way to audit benchmarks in the database industry. They’ve been around for a bit, but let me know if you want more info, I’m happy to share more about them.

lmeyerov · on Nov 13, 2021

It sounds fundamentally busted if a competitor can submit benchmarks for someone else. TPC is great in general, but I didn't realize it had such a gaping flaw.

TPC submissions take real time/$/energy/expertise, so I don't know anyone who has ever done it casually. Ex: It was a multi-company effort for the RAPIDS community to get enough API coverage & edge case optimization for an end-to-end GPU submission on the big data one (SQL, ...), and even there the TPC folks made them resubmit if I remember right.

Also, note how the parent's response did not actually answer 'audited how'. Pushing the work to the questioner is on the shortlist of techniques studied by misinformation researchers. I'm a fan of both companies, so disappointing to see from a company rep.

rxin · on Nov 13, 2021

Check my reply, Leo.

lmeyerov · on Nov 13, 2021

The audit question is on Databricks marketing unaudited Snowflake TPC numbers. I do think Snowflake is big enough to run TPC, but how you guys choose to market is on you.

But: I think it's cool both companies got it to $200-300. Way better than years ago. Next stop: GPUs :)

rxin · on Nov 13, 2021

Ah ok. Wasn't clear. I think some repro scripts will be available soon.

Spivak · on Nov 13, 2021

The results are so crazy different that either Snowflake or Databricks is wrong or outright lying.

jiggawatts · on Nov 13, 2021

This is my point also, and I'm being downvoted for it.

If two people are in disagreement about the same facts, then one of them is either misinformed or lying. It's that simple.

If the only recourse seems to be to sink to the level of mud-slinging, with no clear ability to point to the audit trail and say "this is where it all went wrong", then it calls into question the value of that auditing process.

I'm personally unimpressed with the TPC process in general. I remember one "benchmark" that showed the performance of a 2RU server breaking some record, and it was a minor footnote that it was using a disk array with 7,500 drives in it -- dedicated to that one server for the duration of the test. That's an absurd setup that will never exist at any customer, ever.

I ran that same software myself on literally the exact same server, and it couldn't even begin to approach the posted TPC numbers on typical storage. It was at least two orders of magnitude slower.

The rub was that its inefficient usage of storage was the main problem, and the vendor was pulling a smoke & mirrors trick to hide this deficiency of their product. The TPC numbers were an outright fraud in this case, at least in my mind.

So to me, TPC looks like a staged show where the auditors are more like the referees in a WWE wrestling competition.

ttmahdy · on Nov 13, 2021

The TPC audit process tends to be thorough and strict.

Possibly you missed a configuration that was included in the Full Disclosure Report or Supporting Files?

The Databricks official, audited benchmark was executed against Databricks SQL which is a PaaS service that doesn't allow special tuning btw.

AtlasLion · on Nov 13, 2021

That doesn't allow end users any configuration, but this doesn't apply to the company itself which can apply settings from the background on behalf of end users.

jiggawatts · on Nov 13, 2021

I didn’t miss it. That doesn’t make it any less misleading.

maslam · on Oct 11, 2021

Yeah he’s a special person. His book is a joy to read (and learn from!)

maslam · on Oct 11, 2021

Hey Will I’m sorry it didn’t work out. We’ve actually adjusted really well to COVID (in no small part thanks to our amazing employees).

We care a lot about our hiring experience. I’d be happy to chat about this over email. Feel free to drop me a message at Bilal dot Aslam at databricks dot com.

ildon · on Oct 11, 2021

I think you replied to the wrong comment on hiring processes

solidangle · on Oct 11, 2021

The post was edited to remove comments about hiring

G3nD · on Oct 11, 2021

Is this supposed to be one of those Twitter style "clapbacks"? What a terrible look.

maslam · on Oct 11, 2021

I mean, not really? I’m just a product manager at Databricks who cares about our customers and candidates. Take that for what you will, I guess.

yawnr · on Oct 11, 2021

Why did you bring up hiring? His comment doesn’t even mention it. Unless he edited it.

solidangle · on Oct 11, 2021

The post was edited

atatatat · on Oct 11, 2021

Is Will a customer, or a former candidate?

maslam · on Sept 13, 2021

No. We respect our workers too much.