If I could amend PEP 8 (kennethreitz.org)
46 points by voltagex_ on Jan 23, 2017 | 66 comments


With regards to string quotes, I've always used this idiom, which I find fantastic:

For strings that are meant for human consumption, use double quotes, otherwise use single quotes.

This is super practical when looking for strings meant for translation, end-user formatting, and such.
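A minimal sketch of that convention in Python, where both quote styles behave identically. The names `STATUS_LABELS` and `describe` are hypothetical and only there for illustration:

```python
# Single quotes: strings the program itself consumes (keys, identifiers).
# Double quotes: strings a person will read (messages, labels).
STATUS_LABELS = {
    'ok': "Everything went fine.",
    'not_found': "We could not find what you were looking for.",
}


def describe(status):
    """Return the human-readable message for a machine-readable status key."""
    return STATUS_LABELS.get(status, "Something unexpected happened.")


print(describe('ok'))
```

With this split, searching for double-quoted strings gives a rough list of candidates for translation.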


I love this. I've always followed this guideline without thinking about how to communicate it.

I'm fairly sure that single quotes are much more common in the UK in literature, though, so this only feels like it makes sense to NA and AU speakers.


> I'm fairly sure that single quotes are much more common in the UK in literature, though

It's just that the primary/secondary role is reversed. (This is also reflected in which one you have to hold Shift for on a computer keyboard.)


I like that Erlang more or less enforces that rule: strings are double-quoted, symbols are single-quoted, you generally use symbols for "programmatic strings".


[$i, $n, $d, $e, $e, $d].


ITYM: [$i|[$n|[$d|[$e|[$e|[$d]]]]]].


Yes "Indeed".


Funny, I came here to suggest the same thing. It's a good rule.


It's not a good rule because nobody else uses it, so it's just going to make the code inconsistent when someone else works on it.


All good rules started with nobody else using them. Though, in this case, at least three people discovered that rule independently.


And +1. I thought it was just me. It works quite well in practice and can be applied to many languages. Unfortunately in quite a few of them it prevents interpolation, so it's not a 100% solution.


> Unfortunately in quite a few of them it prevents interpolation, so it's not a 100% solution.

A bigger issue IME is languages (mostly of the C family) where single-quoting is for character literals.

Also in Smalltalk comments are double-quoted, but that's a pretty rare language to encounter.


A lot of people use it. That's how I was taught to write Python and that's how I wrote it in several different companies. At my current job we use double-quotes for everything though (for some reason).


I do. People I work with do. People I mentor do.


I don't agree with allowing characters to go over 80 - there is rarely a time the code can't be broken down to fit.

80 characters helps a lot since you can count on all code to fit within a certain area - so you can run three side/side text buffers in your text editor, instead of two with lots of useless white space because a few lines might be too long.

Doing a simple overview of the requests library, most of the lines over 80 characters are just laziness such as:

https://github.com/kennethreitz/requests/blob/master/tests/t...

And a few of the actual potential cases for going over 80 characters can still be broken down without losing any readability:

https://github.com/kennethreitz/requests/blob/master/request...

          r.headers['Proxy-Authorization'] = _basic_auth_str(self.username,
                                                             self.password)


I have lots of class and method names that are descriptive but also make it harder to stay within 80 chars. In the past I've had to sacrifice readability to follow a strict 80-char limit, so I think it is a bad idea to enforce it. 99% of the time you are not doing merges that require you to have 3 side-by-side buffers, so in my opinion it makes no sense to point to this as the main reason to maintain the limit. A 100-char limit is way more sane. 95% of the time 79 is enough for me, but enforcing it for the remaining 5% means making the code less readable or using more cryptic variable names, which is even worse imo.


I like descriptive names too, EvenToThePointOfAbsurdity, but if the 80 char limit is starting to be imposing, it's usually an indication that code should be refactored as it's being indented too much, or one thing is trying to do too many things.

3 side/side buffers is arbitrary, (it would be 2 on a small laptop display) - and maybe that's why I like 80 characters, it lets me develop on any screen without needing more real estate because the files are long.

I mean Angular has iirc 120 characters, and if you take a look at an average source file eg:

https://github.com/angular/angular/blob/master/modules/%40an...

You'll see a ton of white space after 80 characters that is 'wasted' - this also eats up into horizontal real estate of another buffer which in the end means you actually have less information on the screen with >80 characters rather than more.

In all cases - the code in that file could easily be broken into multiple lines and not lose any readability.


> 3 side/side buffers is arbitrary, (it would be 2 on a small laptop display)

It's about 1/3 of my 13" 2013 Macbook Air - the other 2/3 is usually a browser.

It's also not even about how much horizontal space there is for any reason - it just scans better if you keep it narrower.


> there is rarely a time the code can't be broken down to fit.

Yes, it can. It also might end up being: a) less readable, b) uglier

Resources are limited so I'd rather devote my efforts to fixing major styling problems rather than a line that goes over by 3 characters

Having a hard limit is overblown nitpicking (and even the 1st line of PEP 8 warns you against that)

Not to mention that line continuation involves different, conflicting styles


Find an example in your code base where you had to go over 80 characters.

  find . -name '*.py' | xargs grep -n '^.\{80\}'
and post it here.
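For anyone without `find` and `grep` at hand, here is a rough Python sketch of the same search. The function name `find_long_lines` and its `limit` parameter are hypothetical. The grep pattern above matches lines with at least 80 characters, which is the same as "longer than 79", so that is the default here:

```python
import os


def find_long_lines(root, limit=79):
    """Yield (path, line_number, length) for each .py line longer than limit."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith('.py'):
                continue
            path = os.path.join(dirpath, filename)
            # errors='replace' keeps the scan going on oddly encoded files.
            with open(path, encoding='utf-8', errors='replace') as handle:
                for number, line in enumerate(handle, start=1):
                    length = len(line.rstrip('\n'))
                    if length > limit:
                        yield path, number, length
```

Calling `list(find_long_lines('.'))` from a project root should give the same lines as the shell one-liner, with the length of each line attached.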


You "don't" need to go over 80 chars.

But it's stupid to break a line like this in two

stuff.do_something_funny(abc, def, ghi)

when the limit is reached at the 3rd parameter (or worse, at the parenthesis)


By that reasoning you could almost say 'find an example where you had to go over [insert arbitrary number between 50 and 100 or so]'. If the number is low enough or my screen is wide enough I could fit 4 text editor panes next to each other :] Any code could probably be broken up to match that as well.

Don't get me wrong, I personally think 80 is a fine number and it's what I use myself. But making things like this hard rules for what basically are arbitrary reasons (e.g. depending on screen resolution) doesn't seem right in my book. I'm seriously not even considering doing something about a line if it's 81 characters.


Try this on the python source itself and you'll find a lot of perfectly reasonable examples. One problem is that in a method, your first meaningful indent is actually three (12 spaces in) - two are eaten up in declaration ceremony. In a language with delimiters instead of significant whitespace you could just fiddle with the convention - for instance, you could decide that you're just not going to give up a full indent for toplevel class members. This isn't an option in Python.
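The arithmetic can be checked with a small sketch. The `Report` class and the `indent_widths` helper are made up for illustration, and 4-space indentation is assumed:

```python
import textwrap

# A made-up class: the first nested block inside the method is the loop body.
SAMPLE = textwrap.dedent('''\
    class Report:
        def render(self, rows):
            for row in rows:
                print(row)
''')


def indent_widths(source):
    """Return the number of leading spaces on each non-blank line."""
    return [len(line) - len(line.lstrip(' '))
            for line in source.splitlines() if line.strip()]


widths = indent_widths(SAMPLE)
print(widths)           # [0, 4, 8, 12]
print(79 - widths[-1])  # 67
```

So the first nested statement inside a method starts 12 spaces in, leaving 67 columns under a 79-character limit.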


> One problem is that in a method, your first meaningful indent is actually three (12 spaces in) - two are eaten up in declaration ceremony.

I think that's just as reasonable as the rest of Python's whitespace usage, since it (hopefully!) causes the programmer to consider whether that ceremony (classes, methods, etc.) is necessary or just overcomplication.

It's directly comparable to, say, nesting lambdas in lambdas in lambdas; probably not great for readability, etc. It just-so-happens that in Python's OO, that first lambda is called a class and the second is called a method (objects are a poor man's closures, and closures are a poor man's object!)


For all the 'multi-paradigm' talk, it's still mostly an OO language and almost everything non-trivial written in python is structured in classes and methods. So that's 12 spaces with your first real indent. It's not the end of the world (nor do I think it's there to discourage one from using classes and methods) but it makes it easy enough to hit 80 chars to the point where a strict 80 column limit is a little too restrictive.


Not python, but in Java, identifiers can get quite long. e.g.: "localizableRUNTIME_MODELER_PORTNAME_SERVICENAME_NAMESPACE_MISMATCH" is the longest at 66 characters.

As for python, some fancy list comprehensions get rather long rather quickly. Especially if itertools are involved as you izip, etc. Or using useful variable names (i.e. not 'fcnt', 'tcnt') while destructuring as in this example from pyspark/mllib:

(failure_count, test_count) = doctest.testmod(globs=globs, optionflags=doctest.ELLIPSIS)

I adapted your grep to:

  find . -name '*.py' | xargs grep -n '^ *[^ ].\{79\}'
So leading indentation doesn't mess with the results.


Leading indentation is very much a part of the results especially in a language like python where the indentation is syntactically significant. You can't just decide to toss it to make a line shorter.


Preventing people making unreadable list comprehensions is exactly the sort of thing the 80 character limit is for ;)


> As for python, some fancy list comprehensions get rather long rather quickly.

Sure, and we all know that indentation in Python is not allowed for list comprehensions.

    class Foo:
        # ...
        def bar():
            # ...
            if baz:
                # ...
                result = [
                    y.strip()[1:14].replace("this", "that")
                    for x in nabla.generate(q, w, y, z)
                    for y in x.something_other()
                    if y.has_property(...)
                ]


But the 80 characters limit includes indentation


i just love that i had never thought of trying to find lines of code of a certain length this way and you probably just wrote it without a second thought.

hn is such a cool place. :)


I was in the camp of "I disagree with 79 character rule" when I started. However, as someone who has written lots of Python for work, I realized that it does make the code look more readable and while I used to do it initially out of compliance with the existing code, I now do it because I like it.

Does anyone else feel this way? Or do I like it only because I had to do it and therefore came to like it in the process?


I was the same and of the opinion that this was mostly a relic of the past.

Then I was assigned a project where this was enforced in the test suite and had to religiously follow pep8. It took a while to get used to but after a while the benefits (like fitting into everyone's editor configuration, cleaner list comprehensions etc) became apparent.

Now I'm the one to put flake8 as a part of the test suite in any new python project :)


While 80 characters may be a bit restrictive, there is a need for a hard limit of something.

I've found that in the absence of a hard limit, you all too quickly end up with a codebase riddled with 250+ character monster lines forcing you to scroll right and left all over the place just to get the faintest smidgen of an idea of what's going on. It's a maintenance nightmare.

Meanwhile, GitHub won't show lines longer than 132 characters (on Windows) or 123 characters (on the Mac) without scrolling, no matter how wide your screen.

I personally find 100 seems to strike the right balance.


If punch cards were 160 columns wide, we'd all be enforcing that as the maximum line width. It's arbitrary.

I typically go with 'about 1/3 of my screen width' so I can develop in multiple windows without horizontally scrolling. That's a lot more than 80 chars on a 4k screen.


> 80 characters helps a lot [...] so you can run three side/side text buffers in your text editor, instead of two with lots of useless white space because a few lines might be too long.

Add the fact that there still are people who print the code on paper out there.


very dependent on the language imo

C tends to be terse, so easy to imagine things under 80

python (due to lack of decent types/more verbose community) has more of a need for those 80 chars.

I think there's likely a per-project line limit that "makes sense".


I think this concern over an absolute limit on line length is overblown. Where I work now, we don't use a hard limit on line length. The rule is "keep it readable." That usually means lines less than 100 characters. I try to keep it below 80 myself, but I wouldn't complain about a 120 character line.

I did a search through our codebase for line lengths over 80 or 100 characters and found remarkably few. Most were comment lines.


In situations where the language allows either quote, I think any ounce of time spent by anyone debating to use one or the other is pure waste, including this comment.

Just do whatever you feel like. If your editor, linter, or review process makes you change it, those things are broken.


Inconsistency is not the end of the world but it hurts readability, slightly complicates find/replace and I find it just ugly.

The fact that the code works is most important, but being nice to work on comes a close second.


Code that doesn't work but is readable can at least be fixed.


Agreed, "aligned with opening delimiter" is brain dead, you just waste time re-justifying everything if you refactor. It's just more effort in the long run, and for what? It isn't prettier, more readable, or more convenient.

As for strings, meh. But consistency is nice. Line length exceeding an arbitrary standard, fine. Rules can be broken if there's a good reason.


> Agreed, "aligned with opening delimiter" is brain dead,

As a smug lisp weenie that has been working in python for many years, I completely disagree with this.

> you just waste time re-justifying everything if you refactor.

A single `indent-region` in emacs and you are rejustified, no trouble.

> It's just more effort in the long run, and for what?

It's more effort as opposed to what, exactly?

> It isn't prettier, more readable, or more convenient.

It is, to me, prettier, but I think, objectively, it is more readable than alternatives, especially if there is any kind of nesting involved.

I concede that if there is nesting involved, you might be bumping into other python style issues, and generally might want to add some intermediate variables, etc; however, I do believe that in at least some cases this is entirely inconvenient, and so the lisp way of aligning parameters with their delimiter is the way to go in those cases. Anything else would be quite a bit less readable.


> > you just waste time re-justifying everything if you refactor.

> A single `indent-region` in emacs and you are rejustified, no trouble.

Because when you refactor the function/class name, it's never just one region. So now you have to go back and indent all occurrences, maybe even over multiple files. Brilliant!


> Because when you refactor the function/class name, it's never just one region

... the region is whatever you say it is:

  (mark-whole-buffer)
  (indent-region)
Or, most likely, something like: C-x h TAB

I don't think that's particularly onerous.

It's only a few lines of elisp to visit all the files in a tree and automatically format them. That said, it would probably be more sensible to modify whatever refactoring tool to be indentation aware, if it isn't already. Again, that's easy to do with elisp.


> A single `indent-region` in emacs and you are rejustified, no trouble.

Yeah, I'll just change my text editor because someone in the team wants to use this rule.

So convenient.


Well if your text editor is not powerful enough, maybe it's a problem.


My text editor is powerful enough.


I hate that with a passion.

But I'm not clear what's being suggested in the article. Personally, I'd turn:

    foo = long_function_name(var_one, var_two,
                             var_three, var_four)
into:

    foo = long_function_name(
        var_one, var_two, var_three, var_four
    )
or:

    foo = long_function_name(
        var_one,
        var_two,
        var_three,
        var_four,
    )
depending on the actual length. (Or all on one line if short enough of course.)

"Aligned with opening delimiter" basically just means "put all the code as close to our column limit as possible", which is nuts.


What's your solution for function definitions for this?

  def very_long_function_name(arg_one, arg_two, arg_three, arg_four):
    ...


I believe it will be:

    def very_long_function_name(
        arg_one,
        arg_two,
        arg_three,
        arg_four
    ):
It's consistent and doesn't require alignment spaces, so you can use tabs and let readers choose their own indentation size.


I can't tell (might have got lost in the edit), but PEP8 suggests the arguments are more indented than the function body:

    def very_long_function_name(
            arg_one,
            arg_two,
            arg_three,
            arg_four
        ):
        pass
For data structures, I definitely prefer the bracket to be on its own line, but for function definitions I don't, so I'm also a fan of:

    def very_long_function_name(
            arg_one,
            arg_two,
            arg_three,
            arg_four):
        pass

    def very_long_function_name(
            arg_one, arg_two, arg_three, arg_four):
        pass


> For data structures, I definitely prefer the bracket to be on its own line, but for function definitions I don't, so I'm also a fan of:

    def very_long_function_name(
            arg_one, arg_two, arg_three, arg_four):
        pass
Please don't. It looks awful.


The same thing basically, sibling comment has it.

`):` always either on the first (and only) line or the last line, which ensures it either over-runs the function body indentation level, or falls short:

    def a():
        pass

    def b(
    ):
        pass


Use the same principle as the last of OJFord's examples:

  def very_long_function_name(
      arg_one,
      arg_two,
      arg_three,
      arg_four
  ):
      ...
The nice thing here is that the close parenthesis separates the parameters from the function body, without having to do anything like double-indenting.

What's interesting to me is to try to understand why people are attracted to the column-aligned style even when it has so many problems. I think it comes directly from being unwilling to put spaces inside the parentheses when it's a one-liner:

  long_function_name(var_one, var_two, var_three, var_four)
If you find that line getting too long and want to break it into multiple lines, it's natural that the first thing you do is to turn spaces into newlines:

  long_function_name(var_one,
  var_two,
  var_three,
  var_four)
Well that's ugly, so what can we do with it? A few people indent the arguments after the first one:

  long_function_name(var_one,
      var_two,
      var_three,
      var_four)
That doesn't make much sense; why is the first argument not lined up with the rest? So the next natural thing to try is the column aligned style:

  long_function_name(var_one,
                     var_two,
                     var_three,
                     var_four)
And now you have the problems that brings; fiddly maintenance and excessive line lengths.

But what if you cultivate a style of putting spaces inside the parens, like this:

  long_function_name( var_one, var_two, var_three, var_four )
Now when you have to break it into multiple lines, the natural place to start is again to turn the spaces into newlines:

  long_function_name(
  var_one,
  var_two,
  var_three,
  var_four
  )
and from here it's simple to add some indentation:

  long_function_name(
      var_one,
      var_two,
      var_three,
      var_four
  )
What I haven't figured out is why so many programmers are opposed to putting spaces inside the parentheses. Not only does this lead to better practices when you switch back and forth between single line and multiline styles, but it's more logical too. In this example, the open paren "belongs" to the function call, not to the first argument. Why should the arguments get spaces between them, but the first argument is a special case, directly attached to the function name?

In fact, PEP8, if you take it as gospel, forbids spaces inside the parentheses. But as is common with these things, it gives no reason or rationale for this. It's simply listed as a "pet peeve".

I've seen a few style guides where the authors realized it would be nice to have some whitespace between the function name and first argument, but just couldn't bring themselves to try putting the space inside the parentheses, so they put it outside:

  long_function_name (var_one, var_two, var_three, var_four)
A lot of Unity C# code is written like this, because it's MonoDevelop's default style. It's not terrible when you see a simple example, but it gets pretty bad when there are nested functions:

  DoSomething (Foo (x), Bar (y))
The whitespace here has very little to do with the actual structure of the code. Contrast this with:

  DoSomething( Foo(x), Bar(y) )
Now the things that belong most closely together are visually connected, and spaces separate the things that are less connected. I didn't put spaces inside the parens for the inner functions, only the outer one, to help emphasize what is connected to what.


The one I don't like is the asymmetrical block-quotes in docstrings:

"""This line doesn't start at the start.

But this one does, and the following one ends at the start.

"""


Yeah, agreed. My multiline strings (docstring or otherwise) almost always look like

    """\
    This line starts at the start.
    And so do the following ones.
    """
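One thing to watch with this layout: inside indented code, the string keeps the indentation of its lines unless something strips it. A small sketch, assuming the standard library's `textwrap.dedent` for ordinary strings and `inspect.getdoc` for docstrings (the `usage` function is hypothetical):

```python
import inspect
import textwrap


def usage():
    """\
    Print a short usage message.
    The docstring body starts on its own line.
    """
    # dedent() removes the common leading whitespace from every line.
    message = textwrap.dedent("""\
        Usage: tool [options]
        Options are listed one per line.
        """)
    return message


print(usage(), end='')
print(inspect.getdoc(usage))
```

Both printed strings start at column zero, so the symmetric layout costs nothing at the point of use.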


For me, having to read even the ugliest code is better than having to scroll horizontally. Anyone who thinks there is a single line of code for which it's legit to go beyond 80 characters is either a masochist or using a view with more than 80 characters of available width.


I use a proportional font so it's hard to know where the 80th character is. I just try to be reasonable. Btw, proportional fonts look much better. It took a week to adapt.

What I'd amend is the rule about not having spaces around = in argument lists. It looks horrible and it leads some people to omit spaces in assignments too. I haven't seen so many var=value in source code as in Python since the days of PHP.
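For reference, the rule in question as PEP 8 states it: no spaces around `=` for keyword arguments and unannotated default values, but spaces around `=` in ordinary assignments. A short sketch with a hypothetical `connect` function:

```python
def connect(host, port=8080, timeout=30.0):
    """Return a description of a (hypothetical) connection."""
    return f"{host}:{port} (timeout {timeout}s)"


# Assignment: spaces around "=".
retries = 3

# Keyword argument in a call: no spaces around "=" under PEP 8.
summary = connect('example.org', port=443)
print(summary)
```

The complaint above is that the second form tends to leak into the first, giving `retries=3`.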


Personally, I would have a hard time with the single-quotes point.

Double quotes are just so much easier to catch visually.


Not with color syntax highlighting. I find double quotes unnecessary noise.


How do they suggest you indent function arguments? The Javascript way? Because I don't like that one at all.


This is my preference for function calls and defs that get too long, usually due to indentation or whatever.

  function_call(
      arg1, arg2, arg3
  )

  def function_call(
      arg1, arg2, arg3
  ):
      pass


noooo

):


> Line-length can exceed 79 characters, to 100, when convenient.

My issue with this is that, in reality, everything will now be 100 chars.


There are actually few places where lines need to be this long. If you keep indentation low (by refactoring) most of the lines don't get to 80.



