Hacker News | rhdunn's comments

Locally, I'm using gitolite+cgit. I was previously using Gitea, but that didn't suit my requirements.

I'm using GitHub for my open source projects as:

1. While GitHub Actions has its issues and doesn't work for everyone, I've found it easy to build and test an IntelliJ plugin against multiple IntelliJ versions.

2. I don't have to pay for and manage the hosting of the git repository.


Special Relativity is an extension of Galilean/Newtonian mechanics (the motion of projectiles and other objects) to the case where an object is travelling at a significant fraction of the speed of light. It deals with non-accelerating frames of reference. Satellites need to use it to correct for time dilation effects, but for tracking the trajectory of an arrow or a car travelling from one location to another, classical mechanics is sufficient.

General Relativity is an extension of Newtonian gravity. It is also an extension of Special Relativity to cover accelerating frames of reference. Satellites need to use it, as does tracking the orbit of Mercury. However, for the orbits of the other planets and the moon, Newtonian gravity is sufficient for a reasonable degree of accuracy, and is used for tracking things like equinoxes/solstices, full moons, etc.
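To give a rough sense of scale (the numbers below are my own back-of-envelope figures, using the standard Lorentz factor):

```latex
\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}
```

For a car at v ≈ 30 m/s, γ − 1 ≈ 5×10⁻¹⁵, which is utterly negligible; for a GPS satellite at v ≈ 3.9 km/s, γ − 1 ≈ 8×10⁻¹¹, which accumulates to roughly 7 µs of special-relativistic time dilation per day and must be corrected for.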


Blink is derived from WebKit, so it is in the same family as the other Blink/WebKit-derived browsers. Firefox/Gecko is a different browser implementation.

There are a few YouTube "can I solve [story] before the reveal?" style videos focusing on Agatha Christie novels ranging from around 4 years old to today.

My experience is that at Q5 and lower you start to see noticeable degradation in performance/quality. It's especially noticeable at Q4, where models will easily get trapped in repeating token loops. I generally use Q6.

[1] https://medium.com/@paul.ilvez/demystifying-llm-quantization...
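A toy sketch of why fewer bits means lower quality (this is plain uniform rounding, not the GGUF block-quantization schemes real models use; the weight values are made up):

```python
# Round weights to 2**bits levels over their range and measure the worst-case
# rounding error. Fewer bits -> coarser grid -> larger error.
def quantize(ws, bits):
    lo, hi = min(ws), max(ws)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in ws]

weights = [0.013, -0.42, 0.27, 0.91, -0.88, 0.05]
for bits in (4, 5, 6, 8):
    err = max(abs(w - q) for w, q in zip(weights, quantize(weights, bits)))
    print(f"Q{bits}: max error {err:.4f}")
```

The error roughly halves with each extra bit, which is consistent with Q4 being noticeably worse than Q6.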


Is your experience with this new quantization approach from Intel? Otherwise your comment is a bit off-topic at best, misleading at worst.

The comment in parentheses mentions "they're not derived from a probability space" [1]. I don't know enough about probability spaces or softmax to know what part of a probability space this is missing compared to other probability distributions, nor how other probability distributions satisfy the definition of a probability space.

[1] https://en.wikipedia.org/wiki/Probability_space


Sounds like they're saying that since the distribution doesn't come from measuring or calculating the probability of something, it has the form of a probability distribution but isn't really one. Like saying 5 feet is a height that a person can have, but since I just made up that number it's not actually a person's height.

The softmax is the probability of the next token being whatever it is in the training data, conditioned on the inputs. The author apparently doesn't know that and thinks it was an arbitrary choice.

The author's essay on the sigmoid similarly lacks the deep understanding that it comes from somewhere and isn't an arbitrary choice.


The softmax, after the network has been trained, yields an estimate of the probability in the training data, but it is not that probability itself.

Which models are not trained with the log softmax as the loss function?

Softmax isn't a loss function. It is used to transform model outputs into positive numbers that sum to 1, so that they can be interpreted as probabilities, and then those numbers are passed into (typically) the cross entropy loss function. I think you mean: which models are trained using some function other than softmax to transform the model outputs? There are a number of alternatives to softmax, such as the ones described here: https://www.emergentmind.com/topics/sparsemax

The cross entropy loss function is softmax. They are one and the same.

They’re not. Cross entropy loss is E[-log q] where q is a probability. You could convert the model outputs x into probabilities using some other function like q = 1/Z x^2, and compute cross entropy loss just fine.


Behold the actual definition of cross entropy: https://en.wikipedia.org/wiki/Cross-entropy

It's true that the PyTorch API conflates cross entropy and softmax, but they are separate concepts.
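The distinction can be made concrete with a short sketch (plain Python; the function names and the alternative normalization are mine, for illustration): cross entropy only needs *some* probability vector, and softmax is just one way to produce it.

```python
import math

def softmax(logits):
    # Shift by the max for numerical stability, then normalize exponentials.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def squared_norm(logits):
    # A different map to a probability vector: q_i = x_i^2 / sum_j x_j^2.
    sq = [x * x for x in logits]
    z = sum(sq)
    return [s / z for s in sq]

def cross_entropy(probs, target):
    # Cross entropy against a one-hot target is just -log q[target].
    return -math.log(probs[target])

logits = [2.0, 1.0, 0.5]
target = 0

print(cross_entropy(softmax(logits), target))       # the usual "softmax + CE" loss
print(cross_entropy(squared_norm(logits), target))  # CE over a different normalization
```

Both calls are valid cross entropies; only the first is the softmax-based one that PyTorch's `CrossEntropyLoss` fuses together.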


IIRC, there is a bunch of formal machinery you need to define probability distributions for situations such as infinite outcome spaces (e.g. what is the probability that a random real number between 0 and 10 is less than 3?)
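For that particular example there is a clean answer once you fix a distribution (a uniform one, in this sketch):

```latex
P(X < 3) = \int_0^3 \frac{1}{10}\,dx = \frac{3}{10}
```

The formal machinery (sigma-algebras, measures) is what makes such statements well-defined for arbitrary subsets of the reals, where naive "count the outcomes" reasoning breaks down.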

The repository names all look like two terms/words from Dune (Harkonnen, Mentat, Ornithopter, etc.) followed by a number. This would indicate that the account (or possibly a GitHub auth/Actions token) has been compromised and then used to create the repositories.

What's the use case for this API?

My experience with running LLMs locally is spinning up llama-server (possibly on a separate machine) and then configuring other applications to point to that OpenAI-compatible web server instead of OpenAI or similar.

I don't want a web browser creating/running an LLM instance as that machine may not have the capability or capacity to run an LLM instance.
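To sketch what that setup looks like from the client side (the host, port, and model names below are illustrative): the request shape is identical whether it goes to a local llama-server or a hosted API; only the base URL changes.

```python
import json

# Build an OpenAI-compatible chat completion request. Only base_url differs
# between a local llama-server instance and a hosted provider.
def chat_request(base_url, model, prompt):
    return {
        "url": f"{base_url}/v1/chat/completions",
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }),
    }

local = chat_request("http://192.168.1.20:8080", "local-model", "Hello")
hosted = chat_request("https://api.openai.com", "gpt-4o-mini", "Hello")
print(local["url"])
```

This is why pointing existing tools at a local server works: they already speak this protocol.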


If you want a GitHub-like UI (with org/repo structure limitations) use either Forgejo or Gitea.

If you want a similar but different experience use GitLab.

If you want something more akin to the kernel experience (i.e. hosting, flexible repository structure, user auth via ssh keys, and a simple web UI) use gitolite with cgit, or alternatively gitweb.


There's always gerrit.

I mean, technically it's a code review platform, not a complete toolbox like Gitlab and co, but damn if it isn't the most professional feeling experience.


I love gitea, and I use it for my homelab, but the permissions system needs a lot of work. There’s still an open bug which doesn’t let anyone but the repo owner read CI logs regardless of settings.

I used Gitea for a while, but eventually switched to gitolite+cgit. That was down to the org/repo structure not fitting my git hierarchy (I'm using a topic/repo, topic/subtopic/repo style structure) and the lack of organization/topic wide issue tracking/management.

or sr.ht, you can host it yourself if you want

One challenge is that model evaluation is typically domain/application specific. Model performance can also depend on the system prompt and the input/context.

Regarding evaluation, I've found using tools like promptfoo (and in some cases custom tools built on top of that) are useful. These help when evaluating new models/versions and when modifying the system prompt to guide the model. Especially if you can define visualizations and assertions to accurately test what you are trying to achieve.

This can be difficult for tasks like summarization, code generation, or creative writing that don't have clear answers. Though having some basic evaluation metrics and test cases can still be useful, as can being able to easily do side-by-side comparisons by hand.
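The assertion idea can be sketched in a few lines (this is a minimal stand-in for the approach, not promptfoo itself; `toy_model` and the test cases are made up):

```python
# Run each test case through the model and check every assertion against
# its output, collecting pass/fail results for review.
def evaluate(model_fn, cases):
    results = []
    for case in cases:
        output = model_fn(case["input"])
        passed = all(check(output) for check in case["asserts"])
        results.append({"input": case["input"], "output": output, "passed": passed})
    return results

# Fake "model" so the harness is runnable; swap in a real API call in practice.
def toy_model(prompt):
    return prompt.upper()

cases = [
    {"input": "hello", "asserts": [lambda out: "HELLO" in out]},
    {"input": "42",    "asserts": [lambda out: out.isdigit()]},
]

for r in evaluate(toy_model, cases):
    print(r["input"], "->", "PASS" if r["passed"] else "FAIL")
```

Re-running the same cases against a new model or system prompt then gives a quick regression signal, even for fuzzy tasks.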

