> It's a hard-code compiler, not an interpreter written in Go. That implies some restrictions, but the documentation doesn't say much about what they are. PyPy jumps through hoops to make all of Python's self modification at run-time features work, complicating PyPy enormously. Nobody uses that stuff in production code, and Google apparently dumped it.
There are restrictions. I'll update the README to make note of them. Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.
> If Grumpy doesn't have a Global Interpreter Lock, it must have lower-level locking. Does every built-in data structure have a lock, or does the compiler have enough smarts to figure out what's shared across thread boundaries, or what?
It does fine grained locking. Mutable data structures like lists and dicts do their own locking. Incidentally, this is one reason why supporting C extensions would be complicated.
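As a rough illustrative sketch (this is not Grumpy's actual runtime code, and the class name is made up), per-object locking can be pictured like this in Python terms:

```python
import threading

class LockedList(object):
    """Illustrative only: a list that carries its own lock, the way a
    GIL-free runtime can guard each mutable container individually
    instead of serializing the whole interpreter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def append(self, item):
        # Only this one object is locked; other lists are untouched.
        with self._lock:
            self._items.append(item)

    def __len__(self):
        with self._lock:
            return len(self._items)
```

Under CPython's GIL the per-object lock would be redundant for a single append; paying that cost per container instead of globally is exactly the trade-off being described.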
> Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.
What about stuff like literal_eval? Or even just monkeypatching with name.__dict__[param] = value ?
> It does fine grained locking. Mutable data structures like lists and dicts do their own locking. Incidentally, this is one reason why supporting C extensions would be complicated.
Is there a succinct theoretical description anywhere of exactly how that's implemented? And what about things like NumPy arrays?
> > Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.
> What about stuff like literal_eval? Or even just monkeypatching with name.__dict__[param] = value ?
literal_eval could in principle be supported I think. name.__dict__[param] = value works as you'd expect:
$ make run
class A(object):
    pass
a = A()
a.__dict__['foo'] = 'bar'
print a.foo
bar
NumPy is a library that provides typed multidimensional arrays and functions that operate on them. It bundles a fallback LAPACK/BLAS implementation and can instead link against an external one, but that's a side effect of providing typed arrays and is nowhere near the central purpose of the library.
Also, NumPy is implemented completely in C and Python, and makes extensive use of CPython extension hooks and knowledge of the CPython reference counting implementation, which is part of the reason why it is so hard to port to other implementations of Python.
> I think it can be accomplished by defining the class with type()?
I've done it using more or less that method. The code is in the "coll" sub-package of my plib.stdlib project; the Python 2 version is here on bitbucket:
You won't get exact compatibility, but a metaclass implementation would give you almost all the features. I can't remember exactly what you give up; I did this once and lost some introspection friendliness.
Are namedtuples really that popular? They always felt awkward to me. For a temp variable holding multiple values inside a loop, I use a normal tuple or a dict; for passing data around, a dict or a real class. I never saw the huge win from namedtuple.
namedtuples are tuples, meaning they are stored efficiently and are immutable (thus they can also be used as dictionary keys). Unlike regular tuples, they can be accessed like a class/dictionary for readability, while requiring far fewer allocations than a dict or class, so they are much faster. Also, since they are tuples, you get well-defined methods (printing, comparison, hashing) that you'd have to implement yourself for a dict or class.
If you like writing in functional style, namedtuples are much more natural than dict or classes, and more efficient to boot.
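A minimal sketch of those points (named access, tuple compatibility, hashability, no per-instance dict; `Point` is just an example name):

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

p = Point(1, 2)
assert p.x == 1 and p[0] == 1      # readable named access, still indexable
assert p == (1, 2)                 # compares like a plain tuple
assert hash(p) == hash((1, 2))     # hashable, so usable as a dict key
table = {p: 'label'}               # works where a dict/class instance wouldn't
assert not hasattr(p, '__dict__')  # no per-instance dict: plain tuple storage
```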
Attrs (https://attrs.readthedocs.io/) replaced namedtuple for us (and many others). It's slightly more verbose but allows all class goodness such as methods, attribute validation, etc.
Doesn't work for everything, but you can subclass a namedtuple:
from collections import namedtuple

class Foo(namedtuple("Foo", "a b c")):
    @property
    def sum(self):
        return self.a + self.b + self.c

f = Foo(1, 2, 3)
print f.sum
We use them extensively in our API client code to pass back immutable, well-defined data structures. Dictionaries and classes are mutable and then each layer of code tends to sloppily change them however is convenient, meaning the underlying data can end up being represented differently in different code flows.
Namedtuples are a way to preserve the data unless the consuming code _really_ wants to change it, which is sometimes legitimate.
I'm not totally sold, as in some cases dictionaries or classes would add nice value. But namedtuples have a rigidity that makes you think twice before tampering with retrieved data.
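That rigidity can be sketched in a few lines (names here are illustrative): casual mutation fails loudly, while a deliberate change has to go through `_replace`, which returns a new object and leaves the original intact.

```python
from collections import namedtuple

Record = namedtuple('Record', 'id name')
r = Record(7, 'alice')

try:
    r.name = 'bob'                    # casual mutation is rejected
    raise AssertionError('unreachable: assignment should have failed')
except AttributeError:
    pass

r2 = r._replace(name='bob')           # deliberate change makes a new object
assert r.name == 'alice' and r2.name == 'bob'
```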
In every introductory Python course, tuples are presented as just immutable lists. However, a more accurate way of describing tuples is to think of them as records with no field names. Once you see tuples as records, the fact that they are immutable makes sense, since the order and number of the items matter (they remain constant). Records usually have field names, and that is where namedtuples come in handy. This 2-minute clip also helps clarify what tuples really are: https://youtu.be/wf-BqAjZb8M?t=44m45s. If you are wondering why not just define a class, here are a couple of reasons:
1) You know beforehand that the number of items won't change and that the order matters, since you are handling records. Namedtuples are a simple way of enforcing that constraint.
2) Because they extend tuple, they are immutable too, and they don't store attributes in a per-instance __dict__; the field names live on the class, so if you have tons of instances you save a lot of space.
Why create a class if you probably only need read-only access? But what if you need some method? Then you can extend your namedtuple class and add the functionality you want. If, for example, you want to validate the field values when constructing the namedtuple, you can override __new__.
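A sketch of that `__new__` validation approach (the `Temperature` class and its bounds are made up for illustration):

```python
from collections import namedtuple

_Base = namedtuple('Temperature', ['celsius'])

class Temperature(_Base):
    def __new__(cls, celsius):
        # Validate at construction time; the instance stays immutable after.
        if celsius < -273.15:
            raise ValueError('below absolute zero')
        return super(Temperature, cls).__new__(cls, celsius)

    @property
    def fahrenheit(self):
        return self.celsius * 9.0 / 5.0 + 32
```

Because the validation lives in `__new__` rather than `__init__`, there is no window in which an invalid instance exists.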
At that point it is worth it to take a look at https://pypi.python.org/pypi/recordclass.
Yeah, one of the motivations for adding namedtuple to the stdlib was providing a drop-in compatible upgrade for existing interfaces that returned tuples.
Notable atrocities included `time.localtime()` returning a 9-tuple, and `os.stat()` returning a 10-tuple...
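Both of those were eventually upgraded to tuple subclasses with named fields (struct sequences at the C level, the same idea namedtuple brought to pure Python), so old index-based callers kept working:

```python
import os
import time

t = time.localtime()
assert t.tm_year == t[0]    # named field, still the old 9-tuple underneath

st = os.stat('.')
assert st.st_mode == st[0]  # st_mode was slot 0 of the old 10-tuple
```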
> There are restrictions. I'll update the README to make note of them. Basically, exec and eval don't work. Since we don't use those in production code at Google, this seemed acceptable.
I'm guessing pretty much the entire AST module is a no-go?
I think the CPython AST module is written as a C extension module so currently it's a no-go. I don't think there's a fundamental reason Grumpy couldn't run a pure Python AST module, though.
The ast module itself is in Python, but it imports the _ast module which is an extension module. This actually isn't that big of a deal, though, as the entire AST is defined in a DSL (see https://cpython-devguide.readthedocs.io/en/latest/compiler.h... for some details), so you just have to write some code to generate _ast in Python instead of C (which PyPy may have already done).
I managed to run into two issues trying to build a 5-line program :-)
$ cat t.py; ./tools/grumpc t.py > t.go;go build t.go;echo '----';./t
import sys
print sys.stdin.readline()
----
AttributeError: 'module' object has no attribute 'stdin'
$
$ cat t.py ;./tools/grumpc t.py
c = {}
top = sorted(c.items(), key=lambda (k,v): v)
Traceback (most recent call last):
  File "./tools/grumpc", line 102, in <module>
    sys.exit(main(parser.parse_args()))
  File "./tools/grumpc", line 60, in main
    visitor.visit(mod)
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
    return visitor(node)
  File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stmt.py", line 302, in visit_Module
    self._visit_each(node.body)
  File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stmt.py", line 632, in _visit_each
    self.visit(node)
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
    return visitor(node)
  File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/stin visit_Assign
    with self.expr_visitor.visit(node.value) as value:
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
    return visitor(node)
  File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 101, in visit_Call
    values.append((util.go_str(k.arg), self.visit(k.value)))
  File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/ast.py", line 241, in visit
    return visitor(node)
  File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 246, in visit_Lambda
    return self.visit_function_inline(func_node)
  File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/expr_visitor.py", line 388, in visit_function_inline
    func_visitor = block.FunctionBlockVisitor(node)
  File "/Users/foo/src/grumpy/build/lib/python2.7/site-packages/grumpy/compiler/block.py", line 432, in __init__
    args = [a.id for a in node_args.args]
AttributeError: 'Tuple' object has no attribute 'id'
Basically, we needed to support a large existing Python 2.7 codebase. See discussion here: https://github.com/google/grumpy/issues/1