Planet SymPy

July 02, 2009

Priit Laes

Basic PDE separation of variables implemented

For the most of the past week, I have been working on variable separation of PDE's and yesterday I sent my merge request. Currently two mostly-used strategies - additive and multiplicative separation are implemented and tested on simpler equations like Laplace's.

So, what's next? :)

I am currently prototyping a pdesolve() which can handle first-order equations (and also any hyperbolic PDEs) using characteristic methods. Once this is finished, I can start adding the variable separation into pdesolve() and after this has been done, I can move on to implment support for boundary and initial conditions.

by plaes at July 02, 2009 03:53 AM

June 30, 2009

Dale Peterson

Code cleanup and updating of examples

A few days ago, I discovered the .get() method of the dictionary data type. It allows you to request the value corresponding to a key which a dictionary may not have, in which case you can specify what should be returned instead of rasing a KeyError. In PyDy, I had numerous cases where I was using if/else statements to check if a dictionary had a certain value, but by using the .get() method, I was able to eliminate about 50 or 60 lines of code and make the code a lot more readable and easier to maintain.

The progress on the trigonometric functions has been slow. Ondrej and I worked on some code together that handles most of the cases in the table by Fu et al., but not all. One of the things that has been a little challenging is to identify pi shifts in the argument to the trig functions and map the shift into the first quadrant (0, pi/2) and return the appropriate result, e.g.: sin(x+17*pi/18) –> sin(x + pi/18), etc. The approach has been to match x and r in the following generic argument:

x + r*pi

As I see it, there are three cases to consider: 1) x==0, r!=0, 2) r==0, x!=0, and 3) x!=0, and r!=0. I am splitting the logic up in this fashion and am trying to deal with each case in the appropriate manner. In most cases, r will be of the Rational class, and therefore the modulo operator is needed to bring it to the interval (0, 2*pi), so I implemented the necessary code and sent a patch to the Sympy mailing.

Although the trig functions aren’t quite as they need to be yet, I was still able to manually get the simplifications that are presented by Fu et al. to work using the builtin Sympy commands, which hopefully will mean that simplifying the implementing the algorithm will be feasible with the existing Sympy tools and not too many extra functions will need to be created.

by admin at June 30, 2009 02:35 PM

Fabian Pedregosa

Preparing a new release

Last days I’ve been busy preparing the first public beta of SymPy 0.6.5. Most of the time was spent solving a bug that made documentation tests fail under python2.4, but now that this is solved, I hope that by the end of the week we could have a final release.

When this release is published, we’ll merge my query module and work on getting it right for 0.7.

by fabian at June 30, 2009 06:57 AM

Fredrik Johansson

Meijer G, more hypergeometric functions, fractional differentiation

My last update and the post before it detailed substantial improvements to the code for hypergeometric functions in mpmath, specifically the support for asymptotic expansions for 0F1, 1F1, 2F1, 2F0, plus the ability to evaluate hypergeometric-type formulas with singular parameters.

Over the past week-and-a-half I've done more work along the same lines. Importantly, I've implemented asymptotic expansions also for 1F2, 2F2 and 2F3 (commits 1, 2), so all hypergeometric functions of degree up to (2,3) now support fast evaluation for |z| → ∞ (1F0 also works, if anyone wonders -- it just happens to be a trivial case).

The next major remaining case is 3F2. It has a 1/z transformation, but this leaves |z| ≈ 1 which I don't know how to deal with. Does anyone who happens to be reading this know methods for evaluating 3F2 close to the unit circle? Taylor expansion around some point other than 0 works to some extent, but it's slow and in particular asymptotically slow close to z = 1, so not much help.

Bessel functions, etc.


As expected, the hypercomb function leads to very simple implementations of a large class of functions. I've now implemented the Whittaker, Struve, and Kelvin functions (they are called whitm, whitw, struveh, struvel, ber, bei, ker, kei). I've yet to update the orthogonal polynomials, but that shouldn't be much work. With this, I will have covered most of Bessel-type and hypergeometric-type functions listed on the Wolfram Functions site.

Speaking of Bessel functions, I also addressed most of the problems with their implementation in this commit. In particular, they can now be evaluated for huge arguments:


>>> mp.dps = 30
>>> print besselj(1, 10**20)
-7.95068198242545016504555020084e-11
>>> print chop(besselj(1, 10**20 * j))
(0.0 + 5.17370851996688078482437428203e+43429448190325182754j)
>>> print bessely(1,10**20)
-6.69800904070342428527377044712e-12
>>> print besselk(1,10**20)
9.66424757155048856421325779143e-43429448190325182776


This wasn't trivial, mainly because although hypercomb generally works, it sometimes becomes impossibly slow when computing functions straight from the definition. Basically: the function of interest might decrease exponentially, but internally it is computed by adding two nearly identical terms that grow exponentially, so the working precision and computation time increases exponentially. It's therefore still necessary to switch between different representations in different parts of the complex plane, and figuring that out involves some work. Some cases probably remain to be fixed.

As a followup to last week, I'll attach plots of J0(1/z3), Y0(1/z3) and K0(1/z3):













The K plot unfortunately took very long time to finish -- almost an hour for 200,000 complex evaluations (J and Y were both much faster), so I'll probably have to optimize besselk a bit further.

Fractional derivatives



Another useful enhancement to the Bessel functions is that they can now be differentiated and integrated directly:


>>> mp.dps = 30
>>> print besselj(0, 3.5, derivative=2)
0.419378462090785430440501275011
>>> print diff(lambda x: besselj(0,x), 3.5, 2)
0.419378462090785430440501275011
>>> print besselj(0,3.5,derivative=-1) - besselj(0,2.5,derivative=-1)
-0.244675206320579138611991019242
>>> print quad(lambda x: besselj(0,x), [2.5, 3.5])
-0.244675206320579138611991019242


In fact, the representation for the derivatives works not just for integer orders (including negative values -- giving iterated integrals), but also for fractional or complex values. This led me to implement a general function differint for computing fractional order derivatives / integrals of arbitrary functions. (Commit.) It works essentially directly with the definition of the Riemann-Liouville differintegral.

It gives the same result as the special-purpose implementation for the Bessel function:


>>> print besselj(0, 3.5, derivative=0.5)
-0.436609427860836504473775239357
>>> print differint(lambda x: besselj(0,x), 3.5, 0.5)
-0.436609427860836504473775239357


One neat application is iterated integration. The following gives a 5-fold integral of f(x) = exp(π x), along with the symbolic evaluation of the same as a check:


>>> print differint(lambda x: exp(pi*x), 3.5, -5, x0=-inf)
194.790546022218468869246881408
>>> print exp(pi*3.5) / pi**5
194.790546022218468869246881408


Does anyone have any other interesting applications for fractional differentiation? I'd be interested in more examples to test with and possibly add to the documentation.

The Meijer G-function


At last, the Meijer G-function is implemented! (Commit.) Personally, I think this is something of a milestone for the mpmath project.

The Meijer G-function is very important because of its role in symbolic definite integration. Basically, definite integrals of Meijer G-functions (and even products of Meijer G-functions) just yield new Meijer G-functions; Mathematica and Maple therefore do many integrals by rewriting the input in terms of Meijer G-functions, applying Meijer G transformations, and converting the result back to simpler functions if possible.

Having the Meijer G-function in mpmath should be useful for anyone who wishes to implement a more powerful definite integrator in SymPy for example. It could also be useful for obtaining numerical values from integrals done by hand.

Looking around for examples to do stress testing with, I found a web page by Viktor Toth: Maple and Meijer's G-function: a numerical instability and a cure. His problem is to accurately evaluate G(-; 0; -1/2,-1,-3/2; -; x) for large real values of x. With my Meijer G-function implementation, I get:


>>> mp.dps = 15
>>> print meijerg([[],[0]],[[-0.5,-1,-1.5],[]],10)
4.89717497704114e-5
>>> print meijerg([[],[0]],[[-0.5,-1,-1.5],[]],100)
1.09696661341118e-12
>>> print meijerg([[],[0]],[[-0.5,-1,-1.5],[]],1000)
0.0
>>> print meijerg([[],[0]],[[-0.5,-1,-1.5],[]],10000)
1.53249554086589e+54


The third value should probably be small but not quite zero, and the last value is clearly bogus. Without looking at the details, the cause is almost certainly catastrophic cancellation of two huge terms. Fortunately, there is a cure for this:


>>> print meijerg([[],[0]],[[-0.5,-1,-1.5],[]],1000, check_cancellation=True)
3.34093555343418e-33
>>> print meijerg([[],[0]],[[-0.5,-1,-1.5],[]],10000, check_cancellation=True)
2.43925769071996e-94


The cancellation check should probably be enabled by default, either to automatically redo the computation as above or at least to issue a warning. The only catch with this is that it might lead to unnecessary slowdown and/or annoyance for the user in some cases, so I'll have to investigate how common those cases are.

Incidentally, I also tried plugging the above two calculations into Mathematica 6.0. It takes a long time to finish (mpmath gives the result instantaneously) -- and returns values that are wrong!


In[1]:= u1 = MeijerG[{{},{0}},{{-1/2,-1,-3/2},{}},1000]

3 1
Out[1]= MeijerG[{{}, {0}}, {{-(-), -1, -(-)}, {}}, 1000]
2 2

In[2]:= u2 = MeijerG[{{},{0}},{{-1/2,-1,-3/2},{}},10000]

3 1
Out[2]= MeijerG[{{}, {0}}, {{-(-), -1, -(-)}, {}}, 10000]
2 2

In[3]:= Timing[N[u1,20]]

Out[3]= {8.90265, 0.0017597930166135139087}

In[4]:= Timing[N[u1,50]]

-33
Out[4]= {12.8231, 3.3409355534341801158987353523397047765918571151576 10 }

In[5]:= Timing[N[u2,20]]

50
Out[5]= {59.017, -2.0782671663885270791 10 }

In[6]:= Timing[N[u2,50]]

22
Out[6]= {83.3753, -2.8700325450226332558088281915945389986057044454640 10 }

In[7]:= Timing[N[u2,120]]

Out[7]= {451.365, 2.439257690719956395903324691434088756714300374716395499173\

-94
> 70196218529840153673260714339051464703903148052541923961351654 10 }



That's several minutes for something mpmath did in less than a second, and with less intervention. So maybe Maple and Mathematica should copy my Meijer G-function code ;-)

Complex roots



A small and trivial, but quite convenient, new feature: the nthroot function can now compute any of the roots of a given number and not just the principal root. I also added root as an alias since I have lazy fingers. Like so:


>>> for k in range(5):
... r = root(-12, 5, k)
... print chop(r**5), r
...
-12.0 (1.32982316461435 + 0.966173083818997j)
-12.0 (-0.507947249855734 + 1.56330088863444j)
-12.0 -1.64375182951723
-12.0 (-0.507947249855734 - 1.56330088863444j)
-12.0 (1.32982316461435 - 0.966173083818997j)


While I was at it, I couldn't resist also implementing a function unitroots for computing all the nth roots of unity, optionally just the primitive roots, as well as a function cyclotomic for evaluating the nth cyclotomic polynomial. As it turns out, cyclotomic polynomials are not entirely trivial to evaluate both efficiently and in a numerically stable way -- see the source code for my solution. Commit here. A side effect is that I also implemented a powm1 function that accurately gives xy - 1 (possibly a useful complement to expm1 for other uses as well):


>>> print power(2,1e-100)-1
0.0
>>> print powm1(2, 1e-100)
6.93147180559945e-101


That will be all for now.

by Fredrik Johansson (fredrik.johansson@gmail.com) at June 30, 2009 12:00 AM

June 29, 2009

Freddie Witherden

x Marks the Spot

There it is — the first 'equation' ever rendered by Mathtex! Although it may look like nothing more than a 99-DPI 12pt x in italicized Computer Modern it is really something quite special — a vision of progress.

Below is the parse-tree and glyph stream generated by the program:
freddie@fluorine ~/Programming/mathtex $ python main.py
[Hlist 9.42> [Hlist 0.00> ] [Hlist 9.42> `x` k1.17] [Hlist 0.00> ]]
[(-0.5, 7.0, Bunch(symbol_name=x, metrics=Bunch(advance=9.41821289062, iceberg=7.0, ymax=7.0, height=7.0, width=8.25, slanted=True, xmax=8.6875, xmin=0.5, ymin=0.0), num=120, fontsize=12, offset=0.0, postscript_name=Cmmi10, font=, glyph=))]


Over the last couple of days I have been working on the code that I committed last week (here for those that are interested) and as promised now have something that does work.

However, there are several unpleasantness associated with it: firstly it depends on mathtex.ft2font — the FreeType wrapper used by matplotlib; secondly there is currently only a Cairo backend; thirdly the only font series supported Computer Modern, by way of the Bakoma fonts; fourthly the font paths are currently hard-coded.

I plan to fix all of these issues over the next couple of days — starting with using font metrics files as opposed to FT2Font and then writing a C-based renderer and wrapping it using Cython. I expect that this will be done by Friday.

by Freddie Witherden (noreply@blogger.com) at June 29, 2009 07:56 PM

June 27, 2009

Freddie Witherden

This Week in GSoC Mathtex

Officially, according to my schedule this week should've been spent on producing a set of unit tests. Now, usually when things don't go to schedule it is because something bad or unexpected occurred.

However, while looking through the Mathtex code last week something good — but unexpected — occurred. It seems as if splitting Mathtex from Matplotlib is significantly easier than I first anticipated. Therefore, this week has been spent splitting the behemoth mathtex.py file in Matplotlib into several smaller files, ready for externalisation.

I expect that by Sunday or Monday the SVN repository will have a version of mathtex that is able to render equations using a Cairo backend and the FT2Font library from Matplotlib. Once this is working it shouldn't be to difficult to a) add a bitmap backend using FreeType/Cython/libpng; b) use font metrics files as opposed to FT2Font for metrics information.

On a personal note yesterday was also my last day in university accommodation. As of 22:30 BST I am now home again as opposed to being in central London. Yay for packing and unpacking!

by Freddie Witherden (noreply@blogger.com) at June 27, 2009 10:49 AM

June 26, 2009

Priit Laes

Separation of variables for PDEs

School finally ended last week (no exam results yet, though) and I took a long-long trip back home.. Week started slowly due to the deadly combination of the forgotten laptop power cord, Midsummer day, Victory Day and best weather ever. This resulted in lots of grilling, beer and also a small side-effect: my GSoC mentor is probably quite angry now for missing my weekly update again...

Back to Sympy now.

So far all my various doc-related fixes have been committed to upstream, and I have decided to bite the bullet and start implementing the variable separation for PDEs. So far some very simple testcases based on first-order equations work, but I have some trouble with equations containing higher order derivatives. You can pull from gihtub/pde-separate and please break it :)

by plaes at June 26, 2009 12:01 PM

June 25, 2009

Freddie Witherden

We Have a Mailing List

Following on from the Mathtex project announcement last week we now also have a mailing list. mathtex-dev; http://groups.google.com/group/mathtex-dev?lnk= which is open to all. Although a development list anyone with an interest in the project, should make their voices heard. This will almost certainly become more important in the next week or so when the floor is opened to feature/enhancement requests (backends and syntax support).

by Freddie Witherden (noreply@blogger.com) at June 25, 2009 11:57 AM

June 22, 2009

Dale Peterson

Still working on trigonometric.py

I am still working on modifying trigonometric.py to make all of the trig methods behave as they would in Mathematica/Matlab/Maple. This involves getting the .eval method of each of them correct, and then from there I can work on implmenting a new trigsimp.

by admin at June 22, 2009 11:30 PM

Fabian Pedregosa

Queries and performance

After some hacking on the queries module, I finally got it right without the limitations of past versions. You can check it out from my repo http://fseoane.net/git/sympy.git, branch master.

It now relies even more on logic.inference.satisfiable(), which is just an implementation of the DPLL algorithm. Bad news is that (my implementation of ) dpll_satisfiable() is SLOW, so inevitably queries are SLOW. But everything is not lost, since the algorithm is quite fast, and in fact other variants of the algorithm (MiniSAT) perform 6600x times faster than my implementation on medium-sized problems (60 variables, 170 clauses). So this looks like something smells bad on the programming side …

However, I spent the day profiling the function (link to source code used for profiling) without much success

by fabian at June 22, 2009 10:02 PM

Aaron Meurer

How to permanently lose data with git (and then retrieve it again)

So I pushed some changes to github so Ondrej could help me debug the nseries tests, when I noticed that the changes that I pushed had some bad comments. So I decided to rebase. But git rebase -i told me that there was already a rebase in progress. I figured that I [...]

by asmeurer at June 22, 2009 05:16 AM

June 20, 2009

Fabian Pedregosa

Reading CNF files

The DIMACS CNF file format is used to define a Boolean expression, written in conjunctive normal form, that may be used as an example of the satisfiability problem.

The new logic module (sympy.logic) can read the content of a cnf file and transform it into a boolean expression suitable for use in other methods.

For example, let quinn.cnf be a file with the following content:

c  an example from Quinn's text, 16 variables and 18 clauses.
c Resolution: SATISFIABLE
c
p cnf 16 18
  1    2  0
 -2   -4  0
  3    4  0
 -4   -5  0
  5   -6  0
  6   -7  0
  6    7  0
  7  -16  0
  8   -9  0
 -8  -14  0
  9   10  0
  9  -10  0
-10  -11  0
 10   12  0
 11   12  0
 13   14  0
 14  -15  0
 15   16  0

Then we can load the file and test for satisfiability:

In [1]: from sympy.logic.utilities.dimacs import load_file

In [2]: expr = load_file("quinn.cnf")

In [3]: from sympy.logic.inference import satisfiable

In [4]: satisfiable(expr)
Out[4]:
{cnf_1: True, cnf_11: False, cnf_12: True, cnf_13: True, cnf_14: False, cnf_15: False, cnf_16: True,
 cnf_2: False, cnf_3: True, cnf_4: False, cnf_5: True, cnf_6: True, cnf_7: True, cnf_8: True, cnf_9:
 True}

References: more on the DIMACS CNF file format
BUGS: For large files like this one it exits prematurely with an error in the DPLL algorithm.

by fabian at June 20, 2009 02:27 PM

June 19, 2009

Freddie Witherden

We Have a Project!

Today I got around to creating a Google Code project for Mathtex in preparation for the pending source code commits. The project can be found here: http://code.google.com/p/mathtex/

Don't expect the front page to say like that for long, however!

by Freddie Witherden (noreply@blogger.com) at June 19, 2009 08:03 PM

Fredrik Johansson

Massive hypergeometric update

[Update: as further proof that the asymptotic expansions are working, I've plotted 1/erfi(1/z3), Ai(1/z3) and Bi(1/z3) around 0:








End of update.]

Today I committed a large patch to mpmath that significantly improves the state of hypergeometric functions. It's the result of about a week of work (plus some earlier research).

Perhaps most importantly, I've implemented the asymptotic expansions for 0F1 and 1F1 (the expansions for 2F1 were discussed in the previous post). They should now work for arbitrarily large arguments, as such:


>>> from mpmath import *
>>> mp.dps = 25
>>> print hyp0f1(3,100)
786255.7044208151250793134
>>> print hyp0f1(3,100000000)
4.375446848722142947128962e+8675
>>> print hyp0f1(3,10**50)
4.263410645749930620781402e+8685889638065036553022515
>>> print hyp0f1(3,-10**50)
-2.231890729584050840600415e-63
>>> print hyp0f1(1000,1000+10**8*j)
(-1.101783528465991973738237e+4700 - 1.520418042892352360143472e+4700j)
>>> print hyp1f1(2,3,10**10)
2.15550121570157969883678e+4342944809
>>> print hyp1f1(2,3,-10**10)
2.0e-20
>>> print hyp1f1(2,3,10**10*j)
(-9.750120502003974585202174e-11 - 1.746239245451213207369885e-10j)


I also implemented 2F0 and U (Kummer's second function), mostly as a byproduct of the fact that 2F0 is needed for the asymptotic expansions of both 0F1 and 1F1. 2F0 is an interesting function: it is given by a divergent series (it converges only in special cases where it terminates after finitely many steps):



However, it can be assigned a finite value for all z by expressing it in terms of U. Hence, mpmath can now compute the regularized sum of 2F0(a,b,z) for any arguments, say these ones:


>>> print hyp2f0(5, -1.5, 4)
(0.0000005877300438912428637649737 + 89.51091139854661783977495j)
>>> print hyp2f0(5, -1.5, -4)
102.594435262256516621777


(This ought to be a novel feature; SciPy and GSL implement 2F0, but only special cases thereof, and Mathematica doesn't have a direct way to evaluate 2F0.)

It's the asymptotic case where z → 0, where a truncation of the 2F0 series can be used, that is used for the expansions at infinity of 0F1 and 1F1.

Back to 0F1 and 1F1, the asymptotic expansions of these functions are important because they permit many special functions to be evaluated efficiently for large arguments. So far I've fixed erf, erfc, erfi, airyai and airybi to take advantage of this fact (except for erf and erfc of a real variable, all these functions were previously slow even for a moderately large argument, say |z| > 100).

Examples that now work (well, some of them possibly theoretically worked before too, but probably required hours or ages of universe to finish):


>>> print erf(10000+10000j)
(1.000001659143196966967784 - 0.00003985971242709750831972313j)
>>> print erfi(1000000000)
2.526701758277367028229496e+434294481903251818
>>> print erfc(1000-5j)
(-1.27316023652348267063187e-434287 - 4.156805871732993710222905e-434288j)
>>> print airyai(10**10)
1.162235978298741779953693e-289529654602171
>>> print airybi(10**10)
1.369385787943539818688433e+289529654602165
>>> print airyai(-10**10)
0.0001736206448152818510510181
>>> print airybi(-10**10)
0.001775656141692932747610973
>>> print airyai(10**10 * (1+j))
(5.711508683721355528322567e-186339621747698 + 1.867245506962312577848166e-186339621747697j)
>>> print airybi(10**10 * (1+j))
(-6.559955931096196875845858e+186339621747689 - 6.822462726981357180929024e+186339621747690j)


An essential addition in this patch is the function hypercomb which evaluates a linear combination of hypergeometric series, with gamma function and power weights:



This is an extremely general function. Here is a partial list of functions that can be represented more or less directly by means of it:


  • Regularized hypergeometric series

  • The generalized incomplete gamma and beta functions (and their regularizations)

  • Bessel functions

  • Airy, Whittaker, Kelvin, Struve functions, etc

  • Error functions

  • Exponential, trigonometric and hyperbolic integrals

  • Legendre, Chebyshev, Jacobi, Laguerre, Gegenbauer polynomials

  • The Meijer G-function


That's most of Abramowitz & Stegun, and means that the remaining hypergeometric-type functions available in Mathematica or Maxima but absent in mpmath will be easy to implement in the near future. With these additions, mpmath will have the most comprehensive support for numerical hypergeometric-type functions of any open source software, and should be very close to Mathematica.

The most important virtue of hypercomb is not that it allows for more concise implementations of various hypergeometric-type functions, although that is a big advantage too. The main idea is that hypercomb can deal with singular subexpressions, and particularly with gamma function poles that cancel against singularities in the hypergeometric series. These cases are almost more common than the nonsingular cases in practice, and hypercomb saves the trouble of handling them in every separate function.

Thus, in principle, a numerically correct implementation of hypercomb leads to correct implementations of all the functions in the list above. It's not a silver bullet, of course. For example, if a particular but very common case of some common function triggers an expensive limit evaluation in hypercomb, then it's probably better to handle that case with special-purpose code. There are also most likely some bugs left in hypercomb, although by now I have tested a rather large set of examples; large enough to be confident that it works soundly.

To show an example of how it works, an implementation of the Bessel J function might look like this:


def besj(n,z):
z = mpmathify(z)
h = lambda n: [([z/2],[n],[],[n+1],[],[n+1],-(z/2)**2)]
return hypercomb(h, [n])

>>> mp.dps = 30
>>> print besselj(3,10.5)
0.1632801643733625756462387
>>> print besselj(-3,10.5)
-0.1632801643733625756462387
>>> print besj(3,10.5)
0.1632801643733625756462387
>>> print besj(-3,10.5)
-0.1632801643733625756462387


It gives the same value as the current Bessel J implementation in mpmath. Note that it works even when n is a negative integer, whereas naively evaluating the equation defining J(n,z) hits a gamma function pole and a division by zero in the hypergeometric series.

Thanks to the support for asymptotic expansions, this implementation (unlike the current besselj will also work happily with large arguments):


>>> print besj(3,1e9)
0.00000521042254280399021290721662036


I'm soon going to fix all the Bessel functions in mpmath along these lines, as I already did with erf, airyai, etc.

Here is another, more involved example (quoting directly from the docstring for hypercomb). The following evaluates



with a=1, z=3. There is a zero factor, two gamma function poles, and the 1F1 function is singular; all singularities cancel out to give a finite value:


>>> from mpmath import *
>>> mp.dps = 15
>>> print hypercomb(lambda a: [([a-1],[1],[a-3],[a-4],[a],[a-1],3)], [1])
-180.769832308689
>>> print -9*exp(3)
-180.769832308689


With some tweaks, the same code perhaps with a few tweaks could be used to symbolically evaluate hypergeometric-type functions (in the sense of rewriting them as pure hypergeometric series, and then possibly evaluating those symbolically as a second step). Symbolic support for hypergeometric functions is a very interesting (and hard) problem, and extremely important for computer algebra, but unfortunately I don't have time to work on that at the moment (there is more than enough to do just on the numerical side).

by Fredrik Johansson (fredrik.johansson@gmail.com) at June 19, 2009 07:37 PM

Fabian Pedregosa

Logic module merged

Yesterday I finally merged the logic module in sympy’s official master branch, and should be released together with SymPy 0.6.5.

Next thing to do: profile the code and write some docs before the release.

by fabian at June 19, 2009 10:23 AM

June 18, 2009

Freddie Witherden

Rendering and Backends

As well as familiarising myself with the existing Mathtex code I have also been considering how to actually go about rendering parsed expressions. Matplotlib currently has a very rich collection of backends which all provide significantly more functionality than required by Mathtex — which only needs glyph setting and line drawing.

One of the outstanding issues, however, is that of FreeType. Although there are bindings for virtually every other programming language there are currently none for Python. (There are, however, two failed attempts.) As a result of this Matplotlib includes its own wrapper FT2Font which is C++ based.

In the current implementation of Mathtex all backends use this FreeType wrapper to get glyph metrics (width, height, advance, etc). Some backends then go on to use FreeType for the rendering, while others (such as PDF and SVG) do not. As glyph metrics are, for the most part, invariant I have been considering putting them in a table as opposed to reading them from the font file each time. This is similar to how TeX operates.

The immediate consequence of this would be that FreeType would not be a hard dependency — if one wishes to only produce PDF/SVG files (or use backends which use FreeType indirectly). However, it might lead to reduced rendering quality for bitmapped backends. (I am still investigating this; the reason being that FreeType provides hinted metrics, while a table would not.)

Assuming that there is no perceivable difference then it is likely that for most of the default fonts a look-up table will be produced. This will allow for separation of the parsing/rendering stages: one piece of code can parse the expression and produce a list of glyphs (at specific sizes, styles and locations) and another independent piece of code can then go onto render it. Furthermore this would make it easy for people to make use of Mathtex in their own applications, but just asking for a stream of glyphs/drawing ops and then rendering it themselves.

For the bitmapped backend (arguably the most common) it is likely to be written in C and abstracting away all of the FreeType/compositing operations — which are more natural in C than Python — and then using Cython to create a Python wrapper for it.

If both of these prove viable then they will serve the purpose of sidestepping the FreeType + Python issue entirely. Of course, I think most agree that a well maintained FreeType wrapper for Python is the way to go (and is something I will consider if I have time at the end of my project).

Otherwise my current plan is to use FT2Font from Matplotlib as a wrapper around FreeType. Expect an answer in the next couple of days :)

by Freddie Witherden (noreply@blogger.com) at June 18, 2009 08:17 PM

June 17, 2009

Dale Peterson

NSF Success!

Today I found out that our NSF proposal to the control systems division was accepted! This means I will be funded for two years to study and work bicycles and control! This is a dream come true for me and I am very excited about it.

Soon, I hope to use PyDy to derive the model of the bicycle for our studies. Once trigsimp() is made better, it will be fully capable of this :)

by admin at June 17, 2009 07:19 PM

June 16, 2009

Priit Laes

GSoC Week 2/3

As I have still exam session in progress, I haven't had much motivation to write about my almost non-existent progress with GSoC.

So, school first: I have still one exam left - Optics, which is on 19th and will be hard as some of the stuff we are required to know does not even exist (tunnel effect) in the Hecht's "Optics". (Although I might be wrong, as I only have 2nd edition and later editions might have updates). Fortunately I have almost 100% of the lecture notes available. After the exam, either on Friday or Saturday, I will be moving back home (which is ~1 day of travelling) and then I can really start with GSoC.

Now, GSoC..

As some people have had trouble connecting to my personal server (Who said that free countries do not have internet censorship?), I created a mirror of my private Sympy repo on github. There you can find the following two branches I consider upstream quality:

  • py3k - some preliminary Python3K porting, which gets rid of all the dict.has_key() that were reported during test phase. (So some still remain).

  • doc-fixes - I ran into few problems while tried to build Sympy's documentation - most of them were Sphinx warnings, although ran into some errors too. So please review and comment.

There's also my pde-wip branch, but currently I'm not very happy with that approach...

Thanks Ondrej for pushing me to keep posting :)

by plaes at June 16, 2009 11:33 AM

Dale Peterson

sin, cos, tan, cot, sec, csc

In my trigsimp branch of Sympy, I’ve modified all the .eval methods of sin(), cos(), tan(), and cot() so that they give the same behavior of Mathematica. Next on the list is to implement sec() and csc() in a similar fashion. Currently sec() and csc() do not exist in Sympy, so I will have to familiarize myself with all the other code that is in the trig functions that do exist. It seems to me that much of the code is redundant and may not be tested for, so I will attempt to do some housecleaning in that section while I am at it. The rule based trig simplification should then work fine, I have already started some basic tests of applying the rules that Fu et al. show in their examples, and Sympy handles them just fine.

Ondrej also helped me get trigsimp branch up and running on github:
git://github.com/hazelnusse/sympy.git

My trigsimp branch is there and I have created an issue for it here:
http://code.google.com/p/sympy/issues/detail?id=1475

by admin at June 16, 2009 02:29 AM

June 15, 2009

Fabian Pedregosa

The boolean satisfiability problem

Most annoying problem in my implementation of the query system is that it will not solve implications if the implicates are far away from each other. For instance, if the graph of known facts is something like this

 Integer ----> Rational --> Real --> Complex
   ^  ^
   |  |
   |   -------
   |         |
 Prime      Even
   ^
   |
   |
 MersennePrime

Then it will not know how to handle the query: Is x complex assuming it is a Mersenne prime ?. This is because the vertices MersennePrime and Complex are far away from each other and the query function does not load the complete graph of known facts, but rather a small subgraph centered on the assumed facts …

This was done so for efficiency reasons, because in the initial implementation I feared that the graph of known facts could become huge and thus making it unfeasible to search into.

But things have changed now. Known facts is not huge at all, roughly having over 20 vertices, so it is feasible to build the complete graph the first time query() is called and store it for future uses. And, most important, we have implemented fast algorithms for the problem of boolean satisfiability (DPLL under sympy.logic.algorithms.dpll), so all is ready to implement these ideas in the following days.

Interestingly, there seems to me many open source libraries for solving this problem. One that caught my attention early is MiniSAT, a nice little program written in C++ which is really fast

by fabian at June 15, 2009 04:00 AM

June 12, 2009

Freddie Witherden

Project Update

Sorry for the lack of updates over the last couple of weeks — however I have had my first year undergraduate exams this week and was revising the week prior. With my exams now out of the way I am able to start working full-throttle on my GSoC project.

According to my schedule (which I will post verbatim in a later post) the plan it to spend the first week getting up to speed on exactly how the TeX layout algorithm works and how it is currently implemented in Matplotlib. If all goes to plan you can expect a diagram-filled blog post on it near the end of next week.

My mentors, John Hunter and Michael Droettboom have given me some good reading material on it (along with the current implementation, of course) so this should not be too difficult.

Finally, sometime in the near future you can also expect a discussion on backends (so the code responsible for turning a list of characters and coordinates into an image/document).

by Freddie Witherden (noreply@blogger.com) at June 12, 2009 08:14 PM

Fabian Pedregosa

Initial implementation of the query system

I sent some patches to sympy-patches with an initial implementation of the query system.

You can check it out by pulling from my branch:

git pull http://fseoane.net/git/sympy.git master

into your sympy repo.

Some examples of what you can do (sample isympy session):

In [1]: query(x, positive=True)

Returns None, as we do not know whether x is positive or not.


In [2]: query(abs(x), positive=True)
Out[2]: True

because abs() is always positive. Because exp() is always positive, the following should also be True:


In [3]: query(exp(x), positive=True)

but why then does it return None ?. Well, it simply is not True that exp() is always positive, it is always positive for real values, but SymPy does not assume that x is real, so you would have to specify that. This is now done with the keyword assumptions:


In [5]: query(exp(x), positive=True, assumptions=Assume(x, real=True))
Out[5]: True

As you can see, now assumptions are independent objects and are not tied to symbols any more. For more examples, see the file sympy/query/tests/test_query.py

still in the TODO list:
- support for global assumptions
- solve more complex implications, like query(x, positive=True, assumptions=Assume(x, even=True)) where it should build the chain of implications even => integer => rational => real. This chain of implications currently stops at rational for efficiency reasons, because the number of facts grows in each step which makes the number of possible paths grow exponentially.

by fabian at June 12, 2009 04:36 AM

June 11, 2009

Fredrik Johansson

Hypergeometric 2F1, incomplete beta, exponential integrals

One of the classes of functions I'm currently looking to improve in mpmath is the hypergeometric functions; particularly 1F1 (equivalently the incomplete gamma function) and the Gauss hypergeometric function 2F1.

For example, the classical orthogonal polynomials (Legendre, Chebyshev, Jacobi) are instances of 2F1 with certain integer parameters, and 2F1 with noninteger parameters allows for generalization of these functions to noninteger orders. Other functions that can be reduced to 2F1 include elliptic integrals (though mpmath uses AGM for these). With a good implementation of 2F1, these functions can be implemented very straightforwardly without a lot of special-purpose code to handle all their corner cases.

Numerical evaluation of 2F1 is far from straightforward, and the hyp2f1 function in mpmath used to be quite fragile. The hypergeometric series only converges for |z| 1, and rapidly only for |z| 1. There is a transformation that replaces z with 1/z, but this leaves arguments close to the unit circle which must be handled using further transformations. As if things weren't complicated enough, the transformations involve gamma function factors that often become singular even when the value of 2F1 is actually finite, and obtaining the correct finite value involves appropriately cancelling the singularities against each other.

After about two days of work, I've patched the 2F1 function in mpmath to the point where it should finally work for all complex values of a, b, c, z (see commits here). I'm not going to bet money that there isn't some problematic case left unhandled, but I've done tests for many of the special cases now.

The following is a very simple example that previously triggered a division by zero but now works:

>>> print hyp2f1(3,-1,-1,0.5)
2.5


The following previously returned something like -inf + nan*j, due to incorrect handling of gamma function poles, but now works:

>>> print hyp2f1(1,1,4,3+4j)
(0.492343840009635 + 0.60513406166124j)
>>> print (717./1250-378j/625)-(6324./15625-4032j/15625)*log(-2-4j) # Exact
(0.492343840009635 + 0.60513406166124j)


Evaluation close to the unit circle used to be completely broken, but should be fine now. A simple test is to integrate along the unit circle:


>>> mp.dps = 25
>>> a, b, c = 1.5, 2, -4.25
>>> print quad(lambda z: hyp2f1(a,b,c,exp(j*z)), [pi/2, 3*pi/2])
(14.97223917917104676241015 + 1.70735170126956043188265e-24j)


Mathematica gives the same value:

In[17]:= NIntegrate[Hypergeometric2F1[3/2,2,-17/4,Exp[I z]],
{z, Pi/2, 3Pi/2}, WorkingPrecision->25]
-26
Out[17]= 14.97223917917104676241014 - 3.514976640925973851950882 10 I


Finally, evaluation at the singular point z = 1 now works and knows whether the result is finite or infinite:

>>> print hyp2f1(1, 0.5, 3, 1)
1.333333333333333333333333
>>> print hyp2f1(1, 4.5, 3, 1)
+inf


As a consequence of these improvements, several mpmath functions (such as the orthogonal polynomials) should now work for almost all complex parameters as well.

The improvements to 2F1 also pave the way for some new functions. One of the many functions that can be reduced to 2F1 is the generalized incomplete beta function:



An implementation of this function (betainc(a,b,x1,x2)) is now available in mpmath trunk. I wrote the basics of this implementation a while back, but it was nearly useless without the recent upgrades to 2F1. Evaluating the incomplete beta function with various choices of parameters proved useful to identify and fix some corner cases in 2F1.

One important application of the incomplete beta integral is that, when regularized, it is the cumulative distribution function of the beta distribution. As a sanity check, the following code successfully reproduces the plot of several beta CDF:s on the Wikipedia page for the beta distribution (I even got the same colors!):


def B(a,b):
return lambda t: betainc(a,b,0,t,regularized=True)

plot([B(1,3),B(0.5,0.5),B(5,1),B(2,2),B(2,5)], [0,1])




The betainc function is superior to manual numerical integration because of the numerically hairy singularities that occur at x = 0 and x = 1 for some choices of parameters. Thanks to having a good 2F1 implementation, betainc gives accurate results even in those cases.

The betainc function also provides an appropriate analytic continuation of the beta integral, internally via the analytic continuation of 2F1. Thus the beta integral can be evaluated outside of the standard interval [0,1]; for parameters where the integrand is singular at 0 or 1, this is in the sense of a contour that avoids the singularity.

It is interesting to observe how the integration introduces branch cuts; for example, in the following plot, you can see that 0 is a branch point when the first parameter is fractional and 1 is a branch point when the second parameter is fractional (when both are positive integers, the beta integral is just a polynomial, so it then behaves nicely):


# blue, red, green
plot([B(2.5,2), B(3,1.5), B(3,2)], [-0.5,1.5], [-0.5,1.5])




To check which integration path betainc "uses", we can compare with numerical integration. For example, to integrate from 0 to 1.5, we can choose a contour that passes through +i (in the upper half plane) or -i (in the lower half plane):


>>> mp.dps = 25
>>> print betainc(3, 1.5, 0, 1.5)
(0.152380952380952380952381 + 0.4023774302466306150757186j)
>>> print quad(lambda x: x**2*(1-x)**0.5, [0, j, 1.5])
(0.152380952380952380952381 - 0.4023774302466306150757186j)
>>> print quad(lambda x: x**2*(1-x)**0.5, [0, -j, 1.5])
(0.152380952380952380952381 + 0.4023774302466306150757186j)


The sign of the imaginary part shows that betainc gives the equivalent of a contour through the lower half plane. The convention turns out to agree with that used by Mathematica:


In[10]:= Beta[0, 1.5, 3, 1.5]
Out[10]= 0.152381 + 0.402377 I


I'll round things up by noting that I've also implemented the generalized exponential integral (the En-function) in mpmath as expint(n,z). A sample:


>>> print expint(2, 3.5)
0.005801893920899125522331056
>>> print quad(lambda t: exp(-3.5*t)/t**2, [1,inf])
0.005801893920899125522331056


The En-function is based on the incomplete gamma function, which is based on the hypergeometric series 1F1. These functions are still slow and/or inaccurate for certain arguments (in particular, for large ones), so they will require improvements along the lines of those for 2F1. Stay tuned for progress.

In other news, mpmath 0.12 should be in both SymPy and Sage soon. With this announcement I'm just looking for an excuse to tag this post with both 'sympy' and 'sage' so it will show up on both Planet SymPy and Planet Sage :-) Posts purely about mpmath development should be relevant to both audiences though, I hope.

by Fredrik Johansson (fredrik.johansson@gmail.com) at June 11, 2009 10:34 PM

June 09, 2009

Fredrik Johansson

Mpmath 0.12 released

I'll quote the mailing list announcement:

Mpmath version 0.12 is now available from the website:
http://code.google.com/p/mpmath/

It can also be downloaded from the Python Package Index:
http://pypi.python.org/pypi/mpmath/0.12

Mpmath is a pure-Python library for arbitrary-precision floating-point arithmetic that implements an extensive set of mathematical functions. It can be used as a standalone library or via SymPy (http://code.google.com/p/sympy/).

Version 0.12 mostly contains bug fixes and speed improvements. New features include various special functions from analytic number theory, Newton's method as an option for root-finding, and more versatile printing of intervals. It is now also possible to create multiple working contexts each with its own precision. Finally, mpmath now recognizes being installed in Sage and will automatically wrap Sage's fast integer arithmetic if available.

For more details, see the changelog:
http://mpmath.googlecode.com/svn/tags/0.12/CHANGES

Bug reports and other comments are welcome at the issue tracker at
http://code.google.com/p/mpmath/issues/list or the mpmath mailing list:
http://groups.google.com/group/mpmath


I've previously blogged about some of the new features:

Computing generalized Bell numbers
Approximate prime counting
Fun with zeta functions

See those posts or the documentation for many examples.

by Fredrik Johansson (fredrik.johansson@gmail.com) at June 09, 2009 10:30 PM

June 08, 2009

Dale Peterson

Dyad and Inertia classes implemented

In order to be able to specify the inertia of a rigid body, I implemented the Dyad and Inertia classes. The Dyad class works with the UnitVector class and supports left and right dot multiplication by any other UnitVector or Vector expression (assuming the relative orientations have been specified). The Inertia class is subclassed from Dyad, and allows the user to specify the reference frame, along with the 6 inertia scalars that fully define the mass distribution of a rigid body.

An example of use for the Dyad class would be:

>>> from sympy import *
>>> from pydy import *
>>> t = Symbol(‘t’)
>>> q1 = Function(‘q1′)(t)
>>> A = ReferenceFrame(‘A’)
>>> B = A.rotate(‘B’, 1, q1)
>>> var(‘a b c’)
(a, b, c)
>>> a_Dyad = Dyad(a*A[1]*B[2] + b*A[3]*B[1] + c*A[2]*B[3])
>>> print a_Dyad
Dyad(a·a₁·b₂ + b·a₃·b₁ + c·a₂·b₃)

Most of the time in classical mechanics, dyads are used to keep track of the inertia of rigid bodies, so I also implemented an inertia class:

>>> I_dyad = Inertia(A, (1, 2, 3, 4, 5, 6))
print I_dyad
Dyad(a₁**2 + 2·a₂**2 + 3·a₃**2 + 4·a₁·a₂ + 4·a₂·a₁ + 5·a₂·a₃ + 5·a₃·a₂ + 6·a₁·a₃ + 6·a₃·a₁)

I need to polish the display of the dyad and inertia classes, as they currently don’t display quite right (a₁**2 should be a₁*a₁), but the mathematics is there and behaves as it should:

>>> print dot(A[1], a_Dyad)
a·b₂
>>> print dot(a_Dyad, A[1])
b·b₁

In addition to these classes and class methods, I also implemented a couple of convenience functions: InertiaForce() and InertiaTorque(), which simply form -m*a and -dot(alpha, I) - cross(omega, dot(I, omega)), respectively.

These new functions will be incorporated into the frame work that will allow for the derivation of the equations of motion for a system of particles and rigid bodies.

My finals will be done by Tuesday, I’m looking forward to some serious work on PyDy once exams are over :)

by admin at June 08, 2009 06:23 AM

June 03, 2009

Fabian Pedregosa

Assumption system and automatic theorem proving. Should I be learning LISP ?

This is the third time I attempt to write the assumption system. Other attempts could be described as me following the rule:

“For any complex problem, there is always a solution that is simple, clear, and wrong.”

My first attempt (although better than the current assumption system) did use very rudimentary logic and was not very smart. It could infer some basic rules, like Integer => Rational, but could not construct long paths like Prime => Integer => Rational => Real => Complex. You would have to specify by hand that prime numbers are also complex … and this just what makes the old assumption system unmanageable.

I know that state-of-the art CAS like Mathematica use advanced resolution techniques to get smart results. And in the open source world, well, the only CAS that I could find that has this sort of algorithms in Maxima, but in order to understand that I would have to learn LISP, and ironically enough I started to contribute to SymPy because I didn’t feel like learning LISP …

by fabian at June 03, 2009 11:50 AM

June 02, 2009

Dale Peterson

Progress on trigsimp(), .eval() method of sin, cos, tan, and PyDy printing

A few things I’ve been working on in the last week have been the .eval() methods for the sin, cos, and tan classes inside of sympy.functions.elementary.trigonometric. Each of these class methods control how something like

>>> sin(x + pi/2)

is evaluated. Ideally, this would nicely evaluate to cos(x), by default. Similar results are desirable for cos and tan. This is the behavior that Mathematica has. So far, I have implemented this behavior into the sin and cos functions, and should have tan done in the next few days. A few snags along the way have been dealing with the case when some of the arguments are Function or Derivative types, rather than Symbol, but I think I have figured this out. This will be a nice improvement of the default Sympy behavior and will make implementing trigsimp() easier.

Once the behavior of Table 1 of the paper by Fu et al. (see last post) has been implemented into these .eval() methods of sin, cos, tan, my next task will be to implement the rule based algorithm for trigsimp(). The first step will be to be able to compare the length of two trigonometric expressions, so I will be writing a function that will act as a metric for comparing trig expressions. Once this is done, one can compare a trig expression before and after a trig identity has been applied to it, and if the new expression is smaller, according to the metric, then we can accept it, otherwise we reject it. The algorithm is very customizable because you can try any sequence of trig identities to see if one approach works better than another, although hopefully the sequences of trig identities in the paper will prove to be sufficient for most problems.

Other developments on the PyDy front include adding a subs method to the Vector class, for the purpose of replacing time derivatives of the generalized coordinates with the expressions involving the generalized speeds. This was a simple matter of looping through the expressions corresponding to each of the UnitVector keys in the dictionary that is used to represent each Vector quantity.

The PyDy printer has been customized and now it by default will display sin(q1) as s1, cos(q1) as c1, etc. This really makes things more readable. I had a lot of issues getting unicode to work properly, but now the multiplication ‘*’ symbol is being displayed with a nice small unicode dot, and the display looks a lot cleaner:

(-r1·q5′ - r1·q3′·s4 - r2·q5′·c4)·b₁ + (r1·q4′ + r2·q4′·c4)·b₂ - r2·q4′·s4·b₃

The above output is the velocity of the mass center of the rolling torus example that I am using to test many of the functions I implement in PyDy. What isn’t visible above is that the b₁, b₂, and b₃ symbols are in bold and bright white, because they are unit vectors, so they stand out make the output easy to read. Another idea I have been toying with is to make the time derivative of the generalized coordinates a different color, and similarly with the time derivatives of the generalized speeds. Additionally I have customized the printer to display derivatives with an apostrophe after them, which is a lot easier to read and takes up a lot less space than D(q1(t), t), which was the default Sympy behavior. What would be really cool was if there was a Unicode dot syntax, like \dot{} in LaTeX, then things would be really small. Creating a custom LaTeX printer is on my list of things to do this summer, so that all of the equations that are need to include in a paper can be generated in a LaTeX form that is consistent with the syntax used in the dynamics literature.

by admin at June 02, 2009 10:21 PM

June 01, 2009

Priit Laes

Heatwave has hit the town

Yesterday was so far the hottest day of the year topping near 28 degrees Celsius. It was also the first day of the exam session and I already got my feet wet with "Structure of Matter I". Unfortunately this isn't the only one - I still have exams in Theoretical Physics I, Mathematical Physics III and the hardest of them all - Optics.

So, what about Summer of Code and my project Partial Differential Equation support for Sympy?

I certainly haven't jumped the ship and continue working on it as soon as possible. The first task for me is to define an acceptable syntax for defining and solving PDE problems (with all the hints, assumptions, initial values and boundary conditions).

(After staring this text for 15 minutes and trying to figure out what shall I write here about my plans, I finally have to conclude that it has been a long hot day...)

As a sidenote, I did set up an application for reviewing patches. Hopefully it eases the management of the patch queue and improves overall code quality. (I really find google groups' lack of inline diff support quite unhelpful :( )

PS. You might need root certificate from Cacert, otherwise you may run into issues accessing the site.

by plaes at June 01, 2009 11:38 PM

May 31, 2009

Fabian Pedregosa

Fun with the new Logic module

The logic module is slowly becoming useful. This week I managed to get some basic inference in propositional logic working. This should be enough for the assumption sysmtem (although having first-order inference would be cool).

You can pull from my branch:

git pull http://fseoane.net/git/sympy.git logic

Here are some examples of what it can do:

First, importing and defining our symbols

In [1]: A, B, C = symbols('ABC')
In [2]: from sympy.logic import *

It works with Symbols just as you would expect

In [3]: And(A, B)
Out[3]: And(A, B)

It applies De Morgan Rules automatically

In [4]: Not(Or(A, B))
Out[4]: And(Not(A), Not(B))

converts to conjuntive normal form (CNF)

In [5]: to_cnf(Implies(A, And(B, C)))
Out[5]: And(Or(B, Not(A)), Or(C, Not(A)))

Some basic inference:

In [6]: pl_true( Or(A, B), {A : True})  # what can we say about Or(A, B) if A is True ?
Out[6]: True

In [7]: pl_true ( And(A, B), {B: False}) # what is And(A, B) if B is False
Out[7]: False

To be discussed:
- I’m not sure if we should override &&, || on Symbol so that we can do A && B instead of And(A, B). If would make the code
cleaner, but also I don’t want to bloat Symbol any more. What do you think ?

I’m very proud of this in the sense that it is a nice clean module that will hopefully serve as the foundation of the new assumption
system.

by fabian at May 31, 2009 09:16 PM

Freddie Witherden

Font Hinting Comparison

Although it has taken me much longer than I expected here is the font hinting comparison I promised some weeks ago. The above image is a direct comparison between the four (auto) hinting options provided by FreeType.

The most striking thing is how poor the "Slight" and "Full" hinting options look. (Although they look similar closer inspection reveals them to have some subtle differences.) "None" and "Medium" both produce acceptable results, however. The most notable difference between the two is in the infinity symbol: notice how it is much more consistent with "Medium" hinting as opposed to "None."

So why does all of this matter? Well, rendering quality is extremely important, especially when bitmapped images are being produced -- a potentially common use case for the resulting Mathtex library. Therefore it is important to know exactly how the various rendering options affect the output in order to choose sensible defaults.

About the samples: these were all produced using the svn HEAD version of matplotlib using the Cairo backend (which when running under Linux uses FreeType + fontconfig, making it easy to change the hinting method).

by Freddie Witherden (noreply@blogger.com) at May 31, 2009 09:18 PM

May 27, 2009

Fredrik Johansson

Cython mpmath performance

I'm presently working on a Cython-based backend for mpmath, with the immediate goal to make mpmath competitive as a component of Sage. The Cython code currently depends on utility functions in the Sage library but should eventually be possible to compile for standalone use or together with SymPy's future Cython backend (linking directly against GMP/MPIR).

So far I've implemented a real type (mpf replacement) and a complex type (mpc replacement); addition, multiplication, and the real exponential function; just enough to do some basic benchmarking. Below is a comparison between standard Python mpmath (running on top of sage.Integer), the new Cython-based types, and Sage's RealNumber / ComplexNumber types. I used the inputs x = sqrt(3)-1, y = sqrt(5)-1, and for the complex cases z = x+yi, w = y+xi. The precision ranges between 53 bits (IEEE double compatible) and 3333 bits (≈1000 decimal digits).


Addition, x+y

mpmath cython sage

53 8.72 µs 979 ns 707 ns
100 8.81 µs 1.01 µs 733 ns
333 10.1 µs 1.14 µs 737 ns
3333 10.6 µs 1.59 µs 1.15 µs

Multiplication, x*y

53 9.04 µs 968 ns 728 ns
100 9.56 µs 1.14 µs 901 ns
333 11.3 µs 1.59 µs 1.2 µs
3333 35.7 µs 25.4 µs 17.8 µs

Real exponential function, exp(x)

53 56.2 µs 9.41 µs 16 µs
100 52.8 µs 11.6 µs 11.3 µs
333 122 µs 25.4 µs 26.9 µs
3333 1.38 ms 831 µs 1.14 ms

Complex addition, z+w

53 14.7 µs 1.45 µs 978 ns
100 14.5 µs 1.7 µs 1 µs
333 16.3 µs 1.85 µs 992 ns
3333 17.7 µs 3.2 µs 1.73 µs

Complex multiplication, z*w

53 35.8 µs 2.72 µs 1.68 µs
100 34.5 µs 2.76 µs 2.08 µs
333 43.2 µs 4.38 µs 3.61 µs
3333 139 µs 98.6 µs 70 µs



A few observations can be made. First off, basic arithmetic with number instances is an order of magnitude faster when fully Cythonized. This isn't surprising because something as simple as an addition of two mpmath.mpf instances has to go through some 50 lines of Python code to check types, check for special cases, and round the result (creating lots of temporary objects, etc).

Reimplemented in Cython, arithmetic comes well within half the speed of Sage's numbers. I believe MPFR (which Sage uses) does arithmetic faster because it works directly with the limb data and uses some tricks to speed up the rounding, whereas my implementation uses the mpz interface straightforwardly. Some difference also comes from the fact that MPFR uses machine-precision integers for exponents, whereas I'm using mpz_t to allow arbitrary-size exponents. In any case, the results are very good.

At very low precision, memory allocations account for probably half the time (for both my Cython-based types and Sage's RealNumber). Thus there is some room for improvement later on by implementing a freelist.

The best news (for special functions) is that my exponential function is as fast as MPFR's, so the same should be true for other functions when I get there. The new exp (which uses a slightly different algorithm from the one currently in mpmath) is actually slightly faster than MPFR, although it should be said that performance for transcendental functions depends heavily on tuning, the inputs, and possibly other factors (I have no idea why Sage's RealNumber.exp in my benchmark is much slower at 53 bits than at 100 bits, for example), so this is not a conclusive result.

The code itself is currently too messy and ad-hoc to share publicly, sorry.

by Fredrik Johansson (fredrik.johansson@gmail.com) at May 27, 2009 03:17 PM

May 26, 2009

Dale Peterson

Body (Euler) fixed / Space fixed angles implemented in PyDy

The rotate() method of the Vector() class in PyDy now supports the 12 Euler angle rotations and the 12 space fixed rotations. Additionally, it automatically calculates the angular velocity of the new reference frame with respect to the parent frame. Here is how it works:

>>> t = Symbol(‘t’)
>>> q1 = Function(‘q1′)(t)
>>> q2 = Function(‘q2′)(t)
>>> q3 = Function(‘q3′)(t)
>>> A = ReferenceFrame(‘A’)
>>> B = A.rotate(‘B’, ‘BODY123′, (q1, q2, q3))
>>> dot(A[1], B[3])
sin(q2(t))
>>> dot(A[1], B[2])
-cos(q2(t))*sin(q3(t))
>>> dot(A[1], B[1])
cos(q2(t))*cos(q3(t))
>>> B = A.rotate(‘B’, ‘SPACE321′, (q1, q2, q3))
>>> dot(A[1], B[3])
sin(q2(t))
>>> dot(A[1], B[2])
-cos(q2(t))*sin(q1(t))
>>> dot(A[1], B[1])
cos(q1(t))*cos(q2(t))
>>>

The A[i], B[i] (i=1,2,3) notation indicates the i’th unit vector fixed in each of the reference frames, with the usual right hand rule applying: cross(A[1], A[2]) == A[3]

All 24 angle set conventions are implemented, and tests have been written to ensure they return the correct result. Rotations about an arbitrary axis, as well as the use of Rodrigues parameters or Euler parameters remain to be implemented, but they should be done in the next week.

This code has highlighted the need for some serious work to be done on Sympy’s trigsimp(). Especially when calculating the angular velocity vectors, things get messy quickly and unless the trigonometric simplification routines are up to the task, things will get unnecessarily messy. In the next week I will be working to implement the rule based algorithm presented by Fu et al.:

Automated and readable simplification of trigonometric expressions

Rigid body mechanics are notorious for long nasty trigonometric expressions, so implementing this will really help Sympy and how it deals with trigonometric expressions.

by admin at May 26, 2009 12:36 AM

May 22, 2009

Dale Peterson

PyDy / Sympy progress

Sympy 0.6.5 currently doesn’t allow for you to solve an expression, or differentiate with respect to, anything but a sympy Symbol.  This is problematic for PyDy because generally the generalized coordinates that are used are not just sympy Symbols, but instead are sympy Functions, implicitly dependent upon time.  So what I did was to implement the code that would allow solve to solve for not just Symbols, but also for Functions and Derivatives. So as a trivial example, the following code now works:

>>> from sympy import *
>>> x = Symbol(‘x’)
>>> solve(1-x, x)
[1]
>>> t = Symbol(‘t’)
>>> x = Function(‘x’)(t)
>>> solve(1-x, x)
[1]
>>> solve(1-x.diff(t), x.diff(t))
[1]
>>>

The patch that allows this behavior should be pushed into the main branch of sympy pretty soon, so if you clone the repo in the next few days, you should have this functionality as well.

Like I said, the example is trivial, but now, this will allow PyDy to solve the dynamic equations of motion (linear in the second time derivatives of the coordinates and/or first time derivatives of the generalized speeds). So as long as the linear solvers are up to the task of solving long, long, expressions, PyDy should be able to get the equations of motion in a form that will be directly ready for numerical integration, a key step to the task of generating animations. This is a good step forward.

The next task I’ll be tackling is to fix diff() so that it can differentiate with respect to objects other than the sympy Symbol, i.e., so that you can differentiate with respect to Functions and Derivative instances. This should be a very similar patch and should be done very soon.

Once that is done, I plan on spending some time on improving trigsimp(). Having this function work well will really help PyDy because with these long chains of rotations, the number of trigonometric terms becomes lengthy and having the most compact representation is ideal.

by admin at May 22, 2009 05:32 AM

May 16, 2009

Freddie Witherden

Is there anybody out there?

Sorry for the lack of updates over the past week or so. With exams on the horizon I am finding myself with ever less time to work on my summer of code project. Thankfully, however, I scheduled this into my initial time-line and so it should not be a problem.

I did, however, manage to find the time to work on a quick comparison between the hinting options provided by FreeType's auto-hinter and the current Mathtex implementation. Although not strictly related to my project it is interesting nevertheless. If all goes well I should have a post on it ready by Wednesday.

by Freddie Witherden (noreply@blogger.com) at May 16, 2009 11:37 PM

May 15, 2009

Fabian Pedregosa

Boolean algebra, first steps

The first task for my Summer of Code project is to write a nice boolean algebra module. I wanted a clean module that follows SymPy’s object model and that plays well with other objects. In practice this means that it should inherit from Basic and that it should behave well with python’s boolean values (True, False) as well as with unknown values (SymPy’s Symbol).

To implement boolean expressions, my first thought was to implement And, Or, Not as subclass of AssocOp, just as Add and Mul. At the beginning I had some problems with commutativity, since my expressions are commutative but AssocOp does not have this feature. No problem, I thought, I’ll just override .flatten(). That seemed to work until I wanted to implement double-negation rule for Not (Not(Not(x)) –> x). Maybe I could have used .flatten() also for that, but that was ugly. Overriding __new__ was even uglier, so I began to think for an alternative approach.

Then I tried to think as a mathematician: And and Or are functions: B^2 –> B, and Not: B –> B, where B is the boolean space, so why don’t I implement them as subclasses of Function?. Turned out that Function is very inheritance-friendly and all you have to do is override the .eval() method. This is a minimalistic And function:

class And(Function):
    @classmethod
    def eval(cls, *args):
        if all(map(lambda x: isinstance(x, bool), args)):
            return all(args)
        sargs = sorted(flatten(args, cls=cls))
        return Basic.__new__(cls, *sargs)

You can see full source code here Lot’s of features are missing, of course, but it is capable of doing some simplifications and it plays nicely with Symbol and boolean types.

by fabian at May 15, 2009 05:20 PM

May 10, 2009

Fabian Pedregosa

Logic modules in python

As a prerequisite of my GSOC project I have to do some modifications on sympy’s current logic module (see previous post), so I decided to go out and take a look at what others are doing in this area.

aima-python is a project that tries to implement all algorithms found in the excellent book AI: A Modern Approach (AIMA for short). Code is concise and well written, but is has some (solvable) incompatibilities with SymPy’s design. In this module, all logic expressions (no matter if it’s and, or, xor etc) are implemented as an Expr class, but in SymPy we prefer to have a base class and implement logic expressions (And, Or, etc.) as subclasses, as this has proven to be a good modularization technique sympy’s core objects (Add, Mul, etc.). The book is also my primary source of information, as I only learned very rudimentary logic at college.

Pylog is a very interesting project implementing a logic module (although it’s primary goal is to write a Prolog interpreter in Python). It’s logic module seems to be written in a fashion very similar to SymPy’s standards, with a term class and variables predicates deriving from this. Makes extensive use of python generators, something that is lacking in my Python knowledge (and shouldn’t!)

by fabian at May 10, 2009 11:15 PM

Ondřej Čertík

My experience with running an opensource project

Nir Aides, the author of the excellent winpdb debugger, sent me the following email on September 21, 2008, so I asked him if I can copy his email and reply in form of a blog post (so that other people can comment and join the discussion) and he agreed. It took me almost a year to reply, but I made it. :)

Hi Ondrej,

How are you?

I am about to publish a new free software project, a new simple PHP framework, and I am interested in your advice.

You started SymPy and were able to make other people join you and develop it with you.
How did you do it?
How did it happen?
Did you actively call for other people or they spontaneously showed interest and joined you?
Are the other major contributor people who were your friends before you started the project?
Did you need to create or manage the project in a particular way to make it attractive to other people?
Are there things you are aware of that promote collaboration or demote it?

I was never successful in doing the same with Winpdb, which while it became reasonably popular, no one has ever joined me to develop it, except for a notable tutorial contribution by Chris Lasher which was developed independently.

Now with the new project, I am wondering what are my chances of making other people try it and take it on. On the one hand it is a new and fresh code base in an interesting field, on the other hand, why would anyone bother to spend their energy on this new project when they have Symfony or Drupal?

What do you think?

BTW, Ohloh believes you have a median of 19,000 lines of changed code per month since the start of their log. Can this be true? Is this humanly possible? According to it SymPy has over 1,000,000 lines of code? I can't understand these numbers. Winpdb has about 25,000 lines after 3 years of development. And from my experience 1,000,000 lines of code projects need about 20-50 full time developers to work on for 2-5 years which is about 40-250 man years. And as if this is not enough you are listed as owner in a dozen other projects in Google code and have enough time to become an awarded scientist. How is this possible?

http://www.ohloh.net/p/sympy/contributors/

BTW2, do you still use Winpdb? If you find yourself using it less, can you say what are the reasons, or what it would take to make it more useful?

BTW3, How is SymPy doing?

Cheers,
Nir



So my most honest answer how to run a successful opensource project is: I don't know.

But nevertheless I tried to summarize some of my ideas and experience and some guidelines that I try to follow, maybe it will be useful to you Nir, or anyone else.

First of all, there has to be a public mailinglist (easily accessible), public bug tracker, nice webpage, easy to find downloads, frequent releases (once a month is good, but in the worst case at least 4 times a year) and a set of guidelines to follow in order to contribute. So that's a must, if the project doesn't have the above, it's almost impossible to become successful. However, that is just a start, just a playground. There are still many projects that have the above and yet they totally fail to attract developers.

So I think the most important principle is that I always think how to employ other people in what I do. If I have some plan in my head how to do something, e.g. how to move some things forward, I always create exact steps and put it to issues, or our mailinglist, so that each step can be done by someone who is completely new to sympy. So I try to look at things from other people's perspective and think -- ok, I quite like this SymPy project and I'd like to get this done (for example a new release, or something fixed, or implemented), but I have no idea how to start and what exactly needs to be done.

So what I try to do if someone comes to our list and asks for something, is that I create a new issue for it and think how I would fix it if I had time. Then write the necessary steps in the issue and invite the submitter to fix it and I offer help with explaining anything and guiding. Now there are two things that can happen. Either the submitter has time and a will to go forward and in this case he starts wrestling with it and whenever he has some code or a question, I need to find time, review it and offer some way out. Or the submitter is too busy, in which case the instructions simply rest in the issues and the next time someone asks for the feature, the instructions are already there. I don't have estimates how frequent either case is.

When I am working on something myself, I try not to code privately, but also put up issues first and put the steps needed in the issues, so that it's easy for other people to join in.

In general, the most precious value for me is the fact that someone else had to sit down at his computer and wrote the patch. So I do everything possible to get new (or more) people interested in the development. Some people think that only super programmers can do a decent job and it's useless to invest time in people that may just have started with Python. They are wrong. Among the SymPy developers (around 65 people total have contributed patches so far at the time of writing this post), we have all kinds of people. We have people from high school, we have a retired US army engineer, we have physicists, mathematicians, biologists, engineers, teachers, or just hobbyists, who do it for fun. Unfortunately, we do not have many women (I think no patch that made it into sympy was contributed by a woman, but I may be wrong), so if anyone has any ideas how to get more women involved, let me know (I know we have several women fans, so that's a good start:). We have people whose first open source project they ever contributed to was sympy and people who are new to Python.

Many times the first patch that a new potential developer submits is not perfect, usually it's faster for me to write it myself, than to help with the first patch, however my rule is to always help the submitter do that. Sometimes he sends a second patch, or a third, and usually it needs less and less work on my side and it already pays off, because he is then able to fix things himself, if he discovers a bug and sympy has just won a one more contributor.

So I came to the conclusion that all that is needed is an enthusiasm. You don't even have to know Python (as you can learn all these things on the way) and you can still do useful things for us and really spare our time.

To answer another question from Nir's email, SymPy has about 130000 lines of code and another about 20000 lines of tests, so I think those stats are wrong. Also the changed lines of code is in my opinion wrong, we usually have about 250 new patches per release (this depends how often we release and other things).

Yes, I am involved in couple other projects, e.g. Debian, Sage, ipython, scipy, hpfem.org (and couple more), basically everything that has to do with numeric simulation and Python, but my activity there varies. The most time consuming thing in the last couple years was definitely school, I was finishing my master in Theoretical Physics in Prague and then moved to the Nevada/Reno and I just finished my first semester here at PhD in Chemical Physics, and sometimes it was just crazy, e.g. I finished teaching at 7pm and instead of going home and sleep, I stayed in my office, fixed 10 sympy issues that were holding off a release, finished at 1am, went home (by bike, since I don't have a car yet), slept couple hours and then did just school again for a week, other people reviewed the issues in the meantime, and then I made the release (instead of sleeping again). In the last semester it was not unusual that I got home at 1am every week day, then slept most of Saturday to catch up, on Sunday I did some laundry and shopping, and the rest of time I did grading and homeworks for all my classes and teaching, no time for anything else (e.g. no friends, no girls, no rest, no hobby, no opensource stuff, nothing). So sometimes one has to work pretty hard to get through it, but fortunately it's behind me finally, if all goes well, I should be just doing research from now on and have a real life too. Also I am sorry I didn't manage to reply sooner. :)

To answer the other questions:
Are the other major contributor people who were your friends before you started the project?
No, not a single major contributor was my friend before I started the project. Every single one of them become a developer using the procedure I described above, e.g. first showed on the list or in the issues, and maybe even the very first patch was not a high quality one (and if I was stupid and arrogant, or didn't see the big potential, I would just ignore them). But when given a chance, they became extremely good developers and sympy would simply just not be here without them.

Did you actively call for other people or they spontaneously showed interest and joined you?
I very much encourage everyone to contribute, but the initial interest must be in them, e.g. they at least have to show around the mailinglist/issues, so that I know about them. But once I know they are interested in some issue, yes, I try to invite them to fix it, with my help.

One observation I made is that I have to always think in the spirit "how to earn new money, not how to spare the money I already have", e.g. when applied to sympy, how to get new developers, how to develop the new great things etc. Even if I am super busy as I was, I still have to think this way. Once I start thinking how to conserve and preserve what we already have, I am done, finished and that's the road to hell.

If I am open, positive, full of energy, I can see people joining me and we can do great things together. It probably sounds obvious, but it was not for me, when for example some people I worked with, started their own projects, when I got busy, and started to compete, instead of helping sympy out. And I felt betrayed, after so much work that I invested into it and started to become protective. And then I realised that's wrong. I can never stop other people do what they want to do. If they want to have their own project, they will have it. If they don't want to help sympy out, they won't (and what is more important, there is nothing wrong with either of that). It's that simple and being protective only makes things worse.

There is also a question of the license that you use for the project, e.g. one should basically only choose between BSD (maybe also MIT or Apache), LGPL and GPL (there are also several versions of the GPL licenses). Unfortunately the fact is, that there are people who will never contribute a code under a permissive BSD license (because it's not protecting their work enough) and there are also other people who really want to code to be BSD (or other permissive license) so they can sell it and they don't need to consult with lawyers what they are or aren't allowed to do and also so that they can combine it with any other code (opensource or not). It also depends if one wants to combine (and distribute) other codes together. So choosing a license is also important. I believe that for sympy BSD is the best and for other projects (like Sage) GPL is the best and one has to decide on a case by case basis. For Winpdb, I would make it BSD too, since you can get more people using it.

To conclude, SymPy is a little more than 2 years old, and it has been a great ride so far and more things are coming, e.g. this summer we have 5 Google Summer of Code students and people are starting it to use in their research and we plan to use it in our codes at our group here in Reno too, so things look promising. I am really glad, we managed to build such a community, so that when I am busy, as I was the last semester, other people help out with patches, reviews and other things, so that the project doesn't stall and when I got rid of my school duties now, we can move things forward a lot.

So maybe you can get inspired by some of the ideas above. I am also interested in any discussion about this (feel free to post a comment below, or send me an email, or just write to a sympy list about what you think).

by Ondřej Čertík (noreply@blogger.com) at May 10, 2009 01:47 PM

May 04, 2009

Freddie Witherden

Who are you?

Tell me, I really want to know, who are you? As I am sure many of you know from my last blog post my name is Freddie Witherden and I am a first year physics student at Imperial College London. While this is my first time participating in GSoC I have been active in the open source community for a couple of years now. In this post I'll to outline things of interest I've done heretofore — hopefully keeping the self aggrandisation to a minimum.

My first real experience as a contributor to an open source project came in 2007 when I started submitting patches to the open source real-time strategy game Warzone 2100. Later that year I rewrote the games network code and subsequently became a full fledged developer. Since then I have gone on to maintain the Mac version of the project and am slowly but surely modernising the user interface.

In around December of last year I wrote a web utility to perform error propagation calculations. Given a function, f of n linearly independent variables it would compute the error in f as a function of the errors in each of the input variables. The utility — which can be found here — was written in C++ using GiNaC for symbolic computation and jsMath for LaTeX math output. It was while designing this tool that I discovered the need for an easy way to render LaTeX math expressions without needing a full blown LaTeX install.

Around this time — while looking for alternatives to GiNaC that were more suited for web applications — I discovered Sympy. Since then I have contributed several patches to the LaTeX printing code in order to improve the output quality. But, the problem about how to render the resulting LaTeX code was still an open problem.

After Christmas my first year computing lab started which — among other things — consisted of a project. The project I chose was the double pendulum, a very simple example of a chaotic system. Already knowing C++ I went on ahead to write a graphical simulator using the Qt libraries. Released under the GPL it can be found here.

Although this post has not done a particularly good job at describing who I am I hope that some of the projects which I have had the privilege of being a part of are of some interest.

by Freddie Witherden (noreply@blogger.com) at May 04, 2009 04:32 PM

May 03, 2009

Freddie Witherden

Please allow me to introduce myself

…I'm a man of wealth and taste. My name is Freddie Witherden and I am a first year Physics student at Imperial College London. This summer I have the privilege of participating in Google's Summer of Code — working on externalising the Mathtex rendering engine which exists in Matplotlib into its own library.

Mathtex is a LaTeX rendering engine written purely in Python which is able to parse and render (to various formats) most mathematical expressions. While it is currently part of Matplotlib there are several projects and applications which would could make use of and benefit from it. Therefore I am working under the supervision of the Python Software Foundation to move the code into its own library.

Over the next couple of days — as time permits — I'll post some more information about the project, myself and what's involved.

by Freddie Witherden (noreply@blogger.com) at May 03, 2009 12:01 AM

April 24, 2009

Dale Peterson

April 23, 2009

Fabian Pedregosa

Google Summer of Code

First post on my GSOC adventure.

This year I got accepted in Google’s summer of code program as student, and my job will be to implement the assumptions framework in SymPy.

Although the project officially is only about implementing an assumptions framework, I have to prepare the ground before the official start of the program so that I can finish in time. Basically, this means refactoring the logic module, as this is the foundation of the assumption system. Current logic module was written with the old assumption system in mind, and has done a great job with that, but has some limitations:

- Non-standard API. From what I’ve read, everybody is calling the knowledge base KB, whereas we call it FactRules. Ask method is normally called Ask(), but we call it deduce_all_facts(), etc.

- Does not implement backward chaining, and I really need backward-chaining to make the assumption system a bit smart.

- No support for first-order logic, only propositional logic This is a must for anyone doing serious work in logic.

- Seems over complex. This is only an impression, but I’ve read other logic modules (aima-python), and they are much easier to understand.

Books I am reading on the topic:
- AI: A modern approach, Rusell & norvig
- Combinatorial Theory, Martin Aigner (has some useful info about graphs)

by fabian at April 23, 2009 09:53 PM

Dale Peterson

Greetings

PyDy is coming along nicely, with nearly all the vector calculus functionality implemented and nicely polished.  I am writing more tests to ensure that functions operating on vectors involving long kinematic chains are behaving as they should.

Currently, only simple rotations are permitted with PyDy.  This is the next thing I will be tackling, to add space fixed rotations, Euler parameters, Rodrigues parameters, and quaternions.  I may end up even using quaternions for the internal representation of the orientation of the reference frames, but I need to study this more and it would require a fairly major rework of the current code because currently orientations are stored by a standard 3×3 rotation matrix.

Finally, I am in the process of meeting with several professors here at UC Davis, and hopefully a few at Berkeley, Stanford, and a few people in industry to ask them what features they feel would really benefit PyDy.  I have my own ideas, but I want to find out what somebody with 20 years of experience in dynamics would have to say.  I am sure it will prove interesting.

by admin at April 23, 2009 05:13 PM

April 22, 2009

Priit Laes

ESTCube-1 press release...

As some of you might already know I am also working within a small group of students developing a small cubesat satellite. Last weekend we finally published a press release which basically concludes that our mission is actually feasible.

Now the University of Tartu together with Finnish Meteorological Institute and other European partners particularly from Estonia and Finland are joining forces to develop the first‐ever orbital test of the electric sail physical principle and technology. The ESTCube‐1 is a 1 kg nanosatellite and Estonia's first satellite, with planned launch in 2012.

And good news come in pairs - my project proposal Improve Sympy to handle Partial Differential Equations was accepted into Google Summer of Code. This means that I can basically spend my summer at Saaremaa and not worry about finding a summer job... :D

But more about the GSoC tomorrow...

by plaes at April 22, 2009 06:29 AM

April 20, 2009

Django-SymPy

local open source contest

This project is officially winner of the "Concurso universitario de software libre de Granada" (something like college open source software contest of Granada)



To celebrate it, we've made a little video showing the design process




Empathy project from fabian seoane on Vimeo.

And yes, I promise that I'll put this app online soon.

by Fabian Pedregosa (fabian.seoane@gmail.com) at April 20, 2009 03:18 PM

April 11, 2009

Priit Laes

Some sympy hacking

I went through my local sympy branches and cleaned up some smaller fixes in order to submit them upstream. I did not push the PDE stuff yet, as I feel it needs some more testing. So only four patches this time:

Added corner case for Bernoulli equation.
Small cleanups to the ODE solver and tests.
Fix typo: s/trascendental/transcendental
Fix path to aboutus.txt url.

And also a small note to myself how to publish and delete remote branch:

git push server local_branch:remote_branch
git push server :remote_branch

by plaes at April 11, 2009 08:20 PM

April 05, 2009

Official SymPy blog

SymPy 0.6.4 released

SymPy 0.6.4 has been released on April 4, 2009. It is available at

http://code.google.com/p/sympy/

About SymPy

SymPy is a Python library for symbolic mathematics. It aims to become a full-featured computer algebra system (CAS) while keeping the code as simple as possible in order to be comprehensible and easily extensible. SymPy is written entirely in Python and does not require any external libraries.

Major changes in this release:

  • robust and fast (still pure Python) multivariate factorization
  • sympy works with pickle protocol 2 (thus works in ipython parallel)
  • ./sympy test now uses our testing suite and it tests both regular tests and doctests
  • examples directory tidied up
  • more trigonometric simplifications
  • polynomial roots finding and factoring vastly improved
  • mpmath updated
  • many bugfixes (more than 200 patches since the last release)

The following 21 people have contributed patches to this release (sorted by the number of patches):

  • Ondrej Certik
  • Mateusz Paprocki
  • Fabian Seoane
  • Andy R. Terrel
  • Freddie Witherden
  • Robert Kern
  • Priit Laes
  • Riccardo Gori
  • Fredrik Johansson
  • Aaron Meurer
  • Alan Bromborsky
  • Brian E. Granger
  • Felix Kaiser
  • Kirill Smelkov
  • Vinzent Steinberg
  • Akshay Srinivasan
  • Andrew Docherty
  • Andrew Straw
  • Henrik Johansson
  • Kaifeng Zhu
  • Ted Horst

The following people helped review patches:

  • Riccardo Gori
  • Fabian Seoane
  • Vinzent Steinberg
  • Gael Varoquaux
  • Fredrik Johansson
  • Robert Kern
  • Alan Bromborsky
  • Ondrej Certik


There were 218 new patches since 0.6.3:

$ git log --pretty=oneline sympy-0.6.3..sympy-0.6.4 | wc -l
218

Plans for the future:

Our roadmap: http://wiki.sympy.org/wiki/Plan_for_SymPy_1.0

by Ondřej Čertík (noreply@blogger.com) at April 05, 2009 02:22 AM

March 25, 2009

Priit Laes

Partial differential equations with Sympy

Yesterday I bit the bullet and managed to add a simple solver for the separable variables PDE case into Sympy.

I have also set up a git repository viewer and my two patches can be viewed on the pde-upstream branch.

by plaes at March 25, 2009 05:49 AM

March 17, 2009

Ondřej Čertík

Newtonian Mechanics with SymPy

Luke Peterson from UC Davis came to visit me in Reno and we spent the last weekend hacking on the Python Dynamics package that uses SymPy to calculate equations of motion for basically any rigid body system.

On Friday we did some preliminary work, mostly on the paper, Luke showed me his rolling torus demo that he did with the proprietary autolev package. We set ourselves a goal to get this implemented in SymPy by the time Luke leaves and then we went to the Atlantis casino together with my boss Pavel and other guys from the Desert Research Institute and I had my favourite meal here, a big burger, fries and a beer.

On Saturday we started to code and had couple lines of the autolev torus script working. Then we went on the bike ride from Reno to California. I took some pictures with Luke's iphone:


Those mountains are in California and we went roughly to the snow line level and back:

This is Nevada side:


That was fun. Then we worked hard and by the evening we had a dot product and a cross product working, so we went to an Irish pub to have couple beers and I had my burger as usual.

On Sunday we spent the whole day and evening coding and we got the equations of motion working. On Monday we worked very hard again:



and fixed some remaining nasty bugs. I taught Luke to use git, so our code is at: http://github.com/hazelnusse/pydy, for the time being we call it pydy and after we polish everything, we'll probably put it into sympy/physics/pydy.py. If you run rollingtorus.py, you get this plot of the trajectory of the torus in a plane:

It's basically if you throw a coin on the table, e.g. this model takes into account moments of inertia, yaw (heading), lean, spin and the x-y motion in the plane. Depending on the initial conditions, you can get many different trajectories, e.g for example:

or:


This is very exciting, as the code is very short, and most of the things that Luke needs are needed for all the other applications of sympy, e.g. a good printing of equations and vectors (both in the terminal and in latex), C code generation, fast handling of expressions, nice ipython terminal for experimentation, plotting, etc.

Together with the atomic physics package that we started to develop with Brian sympy will soon be able to cover some basic areas of physics. Other areas are general relativity (there is some preliminary code in examples/advanced/relativity.py) and quantum field theory and Feynman diagrams - for that we need someone enthusiastic that needs this for his/her research --- if you are interested, drop me an email, you can come to Reno (or work remotely) and we can get it done.

My vision is that sympy should be able to handle all areas of physics, e.g. it needs good assumptions (if you want to help out, please help us test Fabian's patches here), then faster core, we have a pretty good optional Cython core here, so we'll be merging it after the new assumptions are in place. Then sympy should have basic modules for most areas in physics so that one can get started really quickly. From our experience so far in sympy/physics, those modules will not be big, as most of the functionality is not module specific.

by Ondřej Čertík (noreply@blogger.com) at March 17, 2009 04:32 PM

March 05, 2009

Ondřej Čertík

SIAM 2009 conference in Miami, part I

I am at the SIAM Conference on Computational Science and Engineering (CSE09) and it is awesome. Right now, I am writing from the 50th floor with Pearu Peterson (f2py), Brian Granger (ipython), Fernando Perez (ipython) and John Hunter (matplotlib), I took videos of them and with their permission, posted to youtube. The view from the balcony is spectacular. My own room used to be in the 15 floor in the Hilton hotel and I thought man, this is high, but then I visited Fernando and John in the 50th floor in their apartment and our 20 story hotel looks like a small hut.

As usual, I met lots of old friends and made some new ones. I liked the electronic structure section on Wednesday and a Python section today. I was also working very hard to get mayavi2 working in Sage to be ready for my presentation and it seems I just finally made it.

by Ondřej Čertík (noreply@blogger.com) at March 05, 2009 08:46 PM

March 04, 2009

Fredrik Johansson

Computing (generalized) Bell numbers

Over the last few days, I procrastinated a bit from more important tasks by implementing Bell numbers (and more generally, evaluation of Bell polynomials) in mpmath. The Bell polynomials are defined by B0(x) = 1 and



In particular, Bn(1) is the nth Bell number, which has combinatorial significance as the number of partitions of a set with n elements.

Instead of using the recurrence formula above, I implemented an approximate formula (discussed below) to provide fast approximate evaluation. So it is for example possible to do:

>>> from mpmath import *
>>> mp.dps = 30
>>> print bell(100000)
1.04339424254293899845402468388e+364471


This algorithms turns out to work well for exact evaluation at integer arguments, by setting the precision large enough, e.g. like this:

>>> from mpmath import *
>>> from mpmath.functions import funcwrapper
>>>
>>> @funcwrapper
... def bellexact(n,x=1):
... mp.prec = 20
... size = int(log(bell(n,x),2))+10
... mp.prec = size
... return int(bell(n,x)+0.5)
...
>>> bellexact(100)
47585391276764833658790768841387207826363669686825611466616334637559114
622672724044217756306953557882560751L
>>>
>>> bellexact(50)
185724268771078270438257767181908917499221852770L
>>> bellexact(40,40)
5288533501514377257070614176831982123270530650242136500078910824523240L


I did a bit of benchmarking for computing large Bell numbers Bn, and this approach appears to be much faster than what other systems use. Below are my timings for SymPy, GAP, Mathematica, and mpmath. (Note: as I do not have a personal Mathematica license, I ran it remotely on a different system, which based on a comparison of integer multiplication speed is 10-20% faster than my own laptop.)

n SymPy GAP Mathematica mpmath
100 0.007 s 0 s 0.005 s 0.012 s
1000 8.9 s 0.78 s 0.32 s 0.30 s
3000 - 67 s 12 s 3.6 s
10000 - - 396 s 76 s


From what I've read, Bell numbers have some interesting number-theoretical properties such as Touchard's congruence, so this could potentially be useful to someone. (Interestingly, I wonder if Touchard's congruence could be turned into an even faster modular algorithm for Bell numbers, similar to David Harvey's algorithm for Bernoulli numbers.)

The formula I implemented for the Bell polynomials is the infinite series



which is known by the name Dobinski's formula.

As a side note, I implemented this series using a piece of experimental new summation code designed to ensure full accuracy in spite of cancellation. For example, the following evaluation involves alternating terms of magnitude over 10400; the internal precision is automatically set to over 400 digits to provide the requested 15 digits:

>>> mp.dps = 15
>>> print bell(10,-1000)
9.55744142784509e+29


I'll probably be working more this summer on implementing similar methods for several mpmath functions that are currently implemented somewhat less robustly (more on that some other time).

One interesting aspect of Dobinski's formula is that it generalizes to arbitrary complex values for n, not just integers. I always prefer to implement functions for the most general possible arguments (and will sometimes hesitate to implement a function if I can only do it for a special case). Generalization is extremely useful: extending functions defined on the integers to the reals permits the use of differentiation, continuous optimization algorithms, etc. Extending further to the complex numbers allows for contour integration. Curiously, Mathematica does not seem to do this for its BellB function.

Unfortunately, there is a difficulty in implementing Dobinski's formula for general n: the 0th term is singular unless n is a nonnegative real number. Removing the term completely breaks compatibility with the standard definition of the Bell polynomials for n = 0. Keeping it either makes the series undefined, or at best renders the function discontinuous at the single point n = 0 which is certainly not desirable. As a workaround, I changed the formula to



The sinc term is continuous (analytic), and zero when n is an integer except when n = 0, so it solves all problems. The sinc function has no connection to Bell polynomials whatsoever as far as I know, but it is the simplest possible "patch" I can think of.

Voila, we now get a smooth transition between for example B0(x) = 0 and B1(x) = x:

>>> f = lambda k: (lambda x: bell(k,x))
>>> plot([f(k) for k in linspace(0,1,5)], [-2,3])




It also becomes possible to e.g. compute a Taylor series of B seen as a function of n, using either a step sum or complex integration for the derivatives:

>>> nprint(chop(taylor(lambda n: bell(n,1), 0, 5, method='step')))
[1.0, 0.222119, -0.504269, 3.32909e-2, 0.307601, 2.09191e-3]
>>> nprint(chop(taylor(lambda n: bell(n,1), 0, 5, method='quad')))
[1.0, 0.222119, -0.504269, 3.32909e-2, 0.307601]


We can for example use the Taylor expansion around n = 0 to evaluate the Bell polynomial with n = 1 (not really practical for anything, but a beautiful example of the power of analysis):

>>> print polyval(taylor(lambda n: bell(n,3), 0, 5)[::-1], 1)
2.9949024096858
>>> print polyval(taylor(lambda n: bell(n,3), 0, 10)[::-1], 1)
2.99998868106305
>>> print polyval(taylor(lambda n: bell(n,3), 0, 15)[::-1], 1)
2.99999998751065
>>> print polyval(taylor(lambda n: bell(n,3), 0, 25)[::-1], 1)
3.0
>>> print bell(1,3)
3.0


Finally, I also implemented the raw series



as a separate function, because it seems like it could be useful in its own right. Unfortunately, this function does not appear to have a standardized notation. For certain special values, it reduces to a Bell polynomial times an exponential function; an incomplete gamma function; or a generalized hypergeometric function. It is also related to the Poisson distribution. Other than that, I have not found much information about it. Does this function have a name? I named it polyexp in mpmath, because it can be seen as an exponential analog of the polylogarithm.

As a function of s, it is an L-series, and shows interesting, highly chaotic behavior in the complex plane:


>>> mp.dps = 5
>>> cplot(lambda s: polyexp(s,1), [-20,20],[0,60], points=50000)
>>> plot(lambda s: polyexp(s*j,1), [0,60])




by Fredrik Johansson (fredrik.johansson@gmail.com) at March 04, 2009 09:49 PM

February 21, 2009

Fredrik Johansson

How (not) to compute harmonic numbers

The nth harmonic number is the nth partial sum of the divergent harmonic series, Hn = 1 + 1/2 + 1/3 + ... + 1/n. The simplest way to compute this quantity is to add it directly the way it is written: 1, 1+1/2, 1+1/2+1/3, and so on. For n approximately greater than 10 or 100, this is algorithm is not a very good one. Why?

Firstly, in floating-point arithmetic, it is much better to use the asymptotic expansion



which requires fewer terms the larger n is. Combining this with recurrence when n is too small gives a fast numerical algorithm for fixed-precision or arbitrary-precision computation of harmonic numbers (especially large ones). It allows one to determine, for example, that it takes exactly 15092688622113788323693563264538101449859497 terms until the partial sums of the harmonic series exceed 100. (This algorithm is implemented in mpmath, and most other comparable software.)

Secondly, the naive algorithm is especially bad for exact computation of harmonic numbers. This might not be obvious at a glance. We are just adding n numbers: what's to improve? The catch is that rational arithmetic has hidden costs: a single addition requires three multiplications and usually a GCD reduction. In contrast with integer or floating-point addition, rational addition gets very slow when the numerators and denominators get large. Computing harmonic numbers the naive way triggers worst-case behavior of rational arithmetic: the denominators grow like n! (if no GCD is performed, the denominator of Hn literally becomes n!).

It should be well known that computing n! as 1, 1·2, 1·2·3, ... is a poor method when n is large. A much better algorithm, assuming that multiplication is subquadratic, is to recursively split the products in half, i.e. compute (1·2·...·n/2) · ((n/2+1)·...·n) and so on. This algorithm has complexity O(log(n) M(n log(n)) compared to O(n2 log(n)) for the naive algorithm, where M(n) is the complexity of multiplying two n-bit integers. This approach is called binary splitting, and has many other applications.

The example of factorials directly suggests an analogous algorithm for computing harmonic numbers with reduced complexity: just recursively split the summation in half.

A few more variations are also possible. Instead of working with rational numbers, which require a GCD reduction after each addition, one can work with the unreduced numerators and denominators as integers, and form a rational number at the end. Another way to obtain a pure-integer algorithm for Hn = p/q is to compute the denominator q = n! and then use the formula



for the numerator.

Below, I have implemented five different algorithms for computing harmonic numbers based on the preceding remarks. The code works with either Python 2.6, using the Fraction type from the standard library, or with gmpy, by adding the following respective header code:


# python
from math import factorial as fac
from fractions import Fraction as mpq
one = 1
mpz = int

# gmpy
from gmpy import mpz, mpq, fac
one = mpz(1)


Algorithm 1: directly add 1, 1/2, 1/3, ..., 1/n:

def harmonic1(n):
return sum(mpq(1,k) for k in xrange(1,n+1))


Algorithm 2: like algorithm 1, GCD reduction postponed to the end:

def harmonic2(n):
p, q = mpz(1), mpz(1)
for k in xrange(2,n+1):
p, q = p*k+q, k*q
return mpq(p,q)


Algorithm 3: compute denominator; compute numerator series:

def harmonic3(n):
q = fac(n)
p = sum(q//k for k in xrange(1,n+1))
return mpq(p,q)


Algorithm 4: binary splitting:

def _harmonic4(a, b):
if b-a == 1:
return mpq(1,a)
m = (a+b)//2
return _harmonic4(a,m) + _harmonic4(m,b)

def harmonic4(n):
return _harmonic4(1,n+1)


Algorithm 5: binary splitting, GCD reduction postponed to the end:

def _harmonic5(a, b):
if b-a == 1:
return one, mpz(a)
m = (a+b)//2
p, q = _harmonic5(a,m)
r, s = _harmonic5(m,b)
return p*s+q*r, q*s

def harmonic5(n):
return mpq(*_harmonic5(1,n+1))


I did some benchmarking on my laptop (32-bit, 1.6 GHz Intel Celeron M), with n up to 104 for the Python version and 106 for the gmpy version. Here are the results (times in seconds):


python:
n harmonic1 harmonic2 harmonic3 harmonic4 harmonic5
10 0.000159 0.000009 0.000010 0.000160 0.000025
100 0.004060 0.000233 0.000234 0.002211 0.000358
1000 0.889937 0.028822 0.029111 0.048106 0.026434
10000 769.109402 3.813702 3.823027 2.710409 3.475280

gmpy:
n harmonic1 harmonic2 harmonic3 harmonic4 harmonic5
10 0.000028 0.000010 0.000010 0.000033 0.000021
100 0.000284 0.000097 0.000111 0.000365 0.000226
1000 0.003543 0.001870 0.004986 0.004280 0.002651
10000 0.103110 0.142100 0.588723 0.059249 0.052265
100000 7.861546 17.002986 74.712460 1.115333 1.200865
1000000 994.548226 - - 27.927289 24.425421


Algorithm 1 clearly does not do well asymptotically. For the largest n measured, it is 284 times slower than the fastest algorithm with Python arithmetic and 40 times slower with gmpy arithmetic. Interestingly, algorithms 2 and 3 both substantially improve on algorithm 1 with Python arithmetic at least up to n = 10000, but both are worse than algorithm 1 with gmpy. This is presumably because GMP uses a much better GCD algorithm that reduces the relative cost of rational arithmetic.

Algorithms 4 and 5 are the clear winners for large n, but it is hard to tell which is better. The timings seem to fluctuate a bit when repeated, so small differences in the table above are unreliable. I think algorithm 5 could be optimized a bit if implemented in another language, by accumulating temporary values in machine integers.

It is amusing to note that computing the 10000th harmonic number with the recursive algorithm and an optimized implementation of rational arithmetic is 14,700 times faster than the direct algorithm with a non-optimized implementation of rational arithmetic. Choosing a larger n (say, 105) gives an even more sensational ratio, of course, which I'm not patient enough to try. The moral is: Moore's law is not an excuse for doing it wrong.

Algorithm 1 is not all bad, of course. With memoization (or implemented as a generator), it is good enough for sequential generation of the numbers H1, H2, H3, ..., and this tends to be needed much more commonly than isolated large harmonic numbers. SymPy therefore uses the memoized version of algorithm 1.

The algorithms above can be adapted for the task of computing generalized harmonic numbers, i.e. numbers of the form 1 + 1/2k + 1/3k + ... + 1/nk for some integer k. Also worth noting is that algorithm 5 allows for computation of Stirling numbers of the first kind. Since |S(n+1,2)| = n! · Hn, algorithm 5 can be viewed as an efficient algorithm for simultaneous computation of S(n,2) and n!. Expressing Stirling numbers in terms of generalized harmonic numbers, this extends to an algorithm for S(n,k) for any (large) n and (small) k.

A final remark: binary splitting is not the fastest algorithm for computing factorials. The fastest known method is to decompose n! into a product of prime powers and perform balanced multiplication-exponentiation. Can this idea be applied to harmonic numbers and/or Stirling numbers? It's not obvious to me how it would be done. Are there any other, possibly even faster algorithms for harmonic numbers? I've looked around, but been unable to find anything on the subject.

by Fredrik Johansson (fredrik.johansson@gmail.com) at February 21, 2009 05:04 PM

February 19, 2009

Fredrik Johansson

Python 3.1 twice as fast as 3.0

For large-integer arithmetic, that is. Mark Dickinson has beeen working on a patch that changes the internal representation of long integers from 15-bit to 30-bit digits, and there is an additional patch that adds a few algorithmic optimizations, written by Mark and Mario Pernici. Likely both patches will pass review and be included in Python 3.1. I asked Mark to run the mpmath unit tests to find out how big a difference the patches make, which he kindly did. Here are the results:

32-bit build:
unpatched (15-bit digits): 42.91 seconds
patched (30-bit digits): 40.57 seconds
patched+optimized: 30.15 seconds

64-bit build:
unpatched: 38.39 seconds
patched: 22.36 seconds
patched+optimized: 21.55 seconds
patched+optimized+bitlength: 20.20 seconds

The last result includes the use of the new int.bit_length() method (which I had a small part in implementing) instead of the pure-Python version in mpmath.

It's not surprising that the patches make high-precision arithmetic much faster, but most of the mpmath tests actually work at low precision and depend mostly on general speed of the Python interpreter. There the speedup is perhaps of order 0-30%. For the tests working at very high precision, the improvement is a factor 4 or 5 with the 64-bit build. Then there are a number of tests in between. With some imagination, all unit tests together provide a plausible estimate of actual performance for a wide range of applications.

An excerpt of before/after results for some particular tests, comparing 64-bit unpatched and 64-bit patched+optimized+bitlength:

# Only tests 53-bit precision
double_compatibility ok 1.3149240 s
double_compatibility ok 1.1979949 s

# Logarithms up to 10,000 digits
log_hp ok 4.0845711 s
log_hp ok 0.7967579 s

# Large Bernoulli numbers
bernoulli ok 6.8625491 s
bernoulli ok 1.4261260 s

# Gamma function, high precision
gamma_huge_2 ok 2.4907949 s
gamma_huge_2 ok 0.5781031 s

# Numerical quadrature, 15 to 100 digit precision
basic_integrals ok 3.0117619 s
basic_integrals ok 1.8687689 s


Mpmath does not yet run on Python 3.x; the benchmarking was made with a version of mpmath hacked slightly for the tests to pass. It would be nice to provide a 3.x compatible version of mpmath soon. The 2to3 tool fortunately handles almost all necessary patching; a handful of fairly simple additional changes need to be made. The most annoying problem is the lack of cmp (and particularly the cmp argument for sorted), which has no trivial workaround, but still should not be too hard to fix. In any case, it seems likely that the 30-bit patch will also be backported to Python 2.7, so most users should be able to take advantage of it.

It would be interesting to compare these benchmarks with the GMPY mode in mpmath. GMPY too provides a ~2x speedup for the unit tests. (This factor would be much larger if tests at even higher precision were included. At a million digits, GMPY is orders of magnitude faster than pure Python.) Originally GMPY only sped up the very high precision tests, and was otherwise slower than pure Python, probably due to Python's internal optimizations for int and long instances. This disadvantage was eliminated by implementing some helper functions for mpmath in C in GMPY. Mario Pernici has recently worked on further optimizations along the same lines, which should substantially improve low-precision performance.

by Fredrik Johansson (fredrik.johansson@gmail.com) at February 19, 2009 10:46 AM

February 17, 2009

Fredrik Johansson

Approximate prime counting

I'm continuing my quest for the special functions coverage in mpmath to match the coverage in Mathematica. Comparisons are not quite fair, because mpmath is (with a few exceptions) numerical-only, and it is missing some heavy-duty functions such as the Meijer G-function (and will for the foreseeable future). Many functions in mpmath are also implemented on smaller domains and less rigorously. On the other hand, Mathematica is more restricted in that it only supports machine-precision exponents. And most importantly, the Mathematica source code is very difficult to read for most people.

In any case, I'm trying to catch up with Mathematica 7, which added a whole slew of functions (not that I've caught up with Mathematica 6 or even 5). I did add a few functions in version 0.11 of mpmath as a consequence of seeing them in Mathematica 7 (as I recall right now, the Barnes G-function, hyperfactorial and generalized Stieltjes constants). See also the "Fun with zeta functions" post, which discussed the addition of the Riemann-Siegel functions in mpmath.

I just committed an implementation of the Riemann R function R(x) (also discovered in the "new in Mathematica 7" list), which is an analytic function that closely approximates the prime counting function π(x). The incredible accuracy of the Riemann R function approximation can be visualized by plotting it against the exact prime counting funtion (I added a naive implementation of π(x) as primepi, mainly to facilitate this kind of comparison -- SymPy contains a slightly more optimized primepi which could be used instead):

>>> from mpmath import *
>>> plot([primepi, riemannr], [0,100])




The accuracy for small x is not a fluke, either:

>>> plot([primepi, riemannr], [100000,105000])



The "classical" approximation based on the logarithmic integral is not nearly as good:

>>> plot([primepi, lambda x: li(x) - li(2)], [100000,105000])



The largest exact value of the prime counting function ever computed was π(1023). The Riemann R function is quite easy to evaluate for x far larger than that, and the relative accuracy only improves:

>>> mp.dps = 50
>>> print riemannr(10**1000)
4.3448325764011974554109300516481778421781159070188e+996

It is quite likely that no one will ever compute the exact integer that is π(101000), but the value shown above is correct to every indicated digit. In fact, R(101000) can be shown using simple estimates to be accurate to about 500 digits.

For the fun of it, I implemented a version of the prime counting function that quickly returns a strict bounding interval for π(x), using an inequality by Lowell Schoenfeld:



This gives for example:

>>> mp.dps = 15
>>> print primepi2(10**9)
[50823160.0, 50875310.0]


Comparing the Riemann R function and the exact prime counting function with the lower and upper bounds of the Schoenfeld inequality:

>>> pi1 = lambda x: primepi2(x).a
>>> pi2 = lambda x: primepi2(x).b
>>> plot([primepi, riemannr, pi1, pi2], [10000,12000])



Returning to the example x = 101000, the bounds obtained are:

>>> mp.dps = 1000
>>> v = primepi2(10**1000)
>>> print v.a
43448325764011974554109300516481778421781159070187535206418774067892028956297421
41091652903064607718165071467289250162191865437538175693583349276176983162304817
20028842283993044722261194636177370816661364951186507714125556442852887672847627
37639487556791903286839317671011195810576335127780015765296354832441209488958529
40270929343735968992442702495339548061518980068329967842108151105502256086735702
70840137536441416124939228421984939684274707774026258195286398483405141793241664
48771389205215847692233236114877473504187398786485377346384217825087356612222721
64076496151428324862762772509621406939237565543669854423809507199075669125082789
94753472799321385531846190616064670701763597463167501715697285784529582495283729
84726395093948427825392438044975721449167967563521761421630319216447872905300292
34804577579707557749789078595077525265053645288083485194663875002648919670801486
00507106899091429169697567568603093512170870024325220369457870653525185029701437
3844901850360974589491752273505884338.0
>>> print v.b
43448325764011974554109300516481778421781159070187535206418774067892028956297421
41091652903064607718165071467289250162191865437538175693583349276176983162304817
20028842283993044722261194636177370816661364951186507714125556442852887672847627
37639487556791903286839317671011195810576335127780015765296354832441209488958529
40270929343735968992442702495339548061518980068329967842108151105502256086735702
70840137536441416124939228421984939684274707774026258195286398483405141793241664
48771389205216030926132955971812691701044333600154515219652233647424144532724506
59125521224590219319931601764655406966919866611954863306151010025269888571358859
44632455086296390810620688247562865809602361895967001311288611164573841496810109
47459439386009668941560164405255365722936716670915493742153084879238459437035932
04613524632469949425914711588697691782286908198073032336714125348156636382686152
39874926303969403597018131954944580668821382841922209788091197293915488520205360
8277109394003350202108994842650400209.0
>>> nprint((v.delta)/v.a)
4.21728e-495
>>> nprint((riemannr(10**1000)-v.a)/v.a)
2.10863e-495


and so it turns out that "accurate to about 500 digits" can be replaced by "accurate to at least 494 digits".

Actually, this is not quite true. Schoenfeld's inequality depends on the truth of the Riemann hypothesis. Thus, it is conceivable that primepi2 might return an erroneous interval. If you find an output from primepi2 that you can prove to be wrong, this must mean either that 1) you've found a bug in mpmath, or 2) you've disproved the Riemann hypothesis. (One of these options is more likely than the other, and reports should therefore be submitted to the mpmath issue tracker rather than Annals of Mathematics.)

On a final note, I've also added the function zetazero for computing the nth nontrivial zero of the Riemann zeta function:

>>> mp.dps = 50
>>> print zetazero(1)
(0.5 + 14.134725141734693790457251983562470270784257115699j)
>>> print zetazero(2)
(0.5 + 21.022039638771554992628479593896902777334340524903j)
>>> print zetazero(10)
(0.5 + 49.773832477672302181916784678563724057723178299677j)
>>> nprint(zeta(zetazero(10)))
(2.14411e-50 - 4.15422e-50j)

It just uses a table to determine correct initial values for the rootfinder, so zetazero currently only works up to n = 100. In fact, it works up to n = 100,000 on an internet-enabled computer; zetazero will automatically try to download the table of 100,000 zeros from Andrew Odlyzko's website if called with n > 100. Or it can load a custom url/file if specified by the user. This is one reason why I love Python: downloading a data file, parsing it and converting it to an appropriate internal representation in virtually a single line of code, easily done in the middle of some algorithmic code.

More information about the Riemann R function can be found in "The encoding of the prime distribution by the zeta zeros", part of Matthew R. Watkins' amazing website. The moral of that document is that the error between the exact prime counting and the Riemann R function can be represented by a kind of "Fourier series" involving a summation over the zeros of the Riemann zeta function. I have not gotten to looking at that yet.

by Fredrik Johansson (fredrik.johansson@gmail.com) at February 17, 2009 06:37 PM

February 08, 2009

Django-SymPy

Design love and formula output

Today has been a day for doing some design tweaks with my partner Angel Soler, just to give empathy a more dynamic look:



It's been also very interesting discussing with Ondrej and other members of the SymPy community the best way to output formulas. Proposed solutions so far are:
  1. jsMath: this uses sympy's latex output and transforms it using javascript into something the browser can read (mathml or images, depending on the browser). Disadvantages: you have to load an entire javascript library, which takes quite a lot of time, and CPU goes up to 90% during 5 seconds.
  2. MathML: sympy supports mathml output natively and mathml is a universal language for displaying formulas. This would be the ideal solution, if it wasn't because of ... 1. Sympy outputs "Content" MathML, whereas firefox (and other browsers) knows only how to display another variant of MathML, "Presentation" MathML. You can transform Content MathML to Presentation MathML (and sympy knows how to do it), but it is an expensive opperation (maybe too expensive to be done on every request), and 2: MathML support in browsers is partial. To view MathML in IE you have to install a plugin (MathPlayer) which, by the way is not open source.
  3. Image. This would be a nice solution, and is the one chosen by wolfram's integrator as the main output. Sympy's png printer is a bit slow, since it does the following: expression -> latex -> dvi -> png.
  4. ASCII (pretty). This is the easiest solution. It renders ok (although i'm having some issues with wrapping), but is not as pretty as the image output.

by Fabian Pedregosa (fabian.seoane@gmail.com) at February 08, 2009 12:45 PM

February 04, 2009

Fredrik Johansson

Galerkin's method in SymPy

I'm currently taking a PDE course, and for this reason I am trying to come terms with the Galerkin method. The Galerkin method is conceptually simple: one chooses a basis (for example polynomials up to degree q, or piecewise linear functions) and assumes that the solution can be approximated as a linear combination of the basis functions. One then chooses the coefficients so as to make the error orthogonal to all basis functions; this amounts to computing a sequence of integrals (the inner products determining orthogonality) and then solving a linear system of equations.

Doing the calculations by hand can get very messy. Fortunately, it is straightforward to implement the Galerkin method e.g. for an ODE IVP in SymPy and have the ugly parts done automatically. The following function implements the global Galerkin method on an interval [x0, x1], using polynomials up to degree q as the basis. It is assumed that the 0th degree basis function is fixed to the initial value.


from sympy import *
sum = __builtins__.sum

def galerkin(ode, x, x0, x1, u0, q):
basis = [x**k for k in range(q+1)]
# Coefficients for the basis monomials
xi = [Symbol("xi_%i" % k) for k in range(q+1)]
# Solution function ansatz
u = u0 + sum(xi[k]*basis[k] for k in range(1,q+1))
# Form system of linear equations
equations = [integrate(ode(u)*basis[k], (x, x0, x1)) \
for k in range(1,q+1)]
coeffs = solve(equations, xi[1:])
return u.subs(coeffs)


In general, the integrals can get very complex, if not unsolvable, and one must fall back to numerical methods. But for a simple problem, SymPy can calculate the integrals and the symbolic solution provides insight. Solving the standard problem u' - u = 0 (with solution u(x) = exp(x)):


x = Symbol('x')
ode = lambda u: u.diff(x) - u
for q in range(1,6):
pprint(galerkin(ode, x, 0, 1, 1, q))


This outputs:

1 + 3*x
2
8*x 10*x
1 + --- + -----
11 11
2 3
30*x 45*x 35*x
1 + ---- + ----- + -----
29 116 116
2 3 4
1704*x 882*x 224*x 126*x
1 + ------ + ------ + ------ + ------
1709 1709 1709 1709
2 3 4 5
31745*x 15820*x 5460*x 1050*x 462*x
1 + ------- + -------- + ------- + ------- + ------
31739 31739 31739 31739 31739


One can see that the coefficients rapidly approach those of Maclaurin polynomials for exp. The fast global convergence is also readily apparent upon visualization:


from mpmath import plot
u1 = Lambda(x, galerkin(ode, x, 0, 1, 1, 1))
u2 = Lambda(x, galerkin(ode, x, 0, 1, 1, 2))
u3 = Lambda(x, galerkin(ode, x, 0, 1, 1, 3))
plot([exp, u1, u2, u3], [0, 2])




The blue curve is exp(x). Note that the plotted interval is larger than the solution interval, so already the q = 3 solution (purple) is essentially accurate to within graphical tolerance. With moderately high degree, one gets a very accurate solution:


>>> galerkin(ode, x, 0, 1, 1, 10).subs(x,'7/10').evalf()
2.01375270747023
>>> exp('7/10').evalf()
2.01375270747048


The Galerkin method might prove useful as an alternative algorithm for high-precision ODE solving in mpmath. Mpmath has no problem with the nonlinear integrations, nor with solving the linear system in case it gets ill-conditioned at higher degree.

I'll perhaps post some more code on this topic in the future, e.g. for solving the ODE with trigonometric and piecewise polynomial basis functions.

by Fredrik Johansson (fredrik.johansson@gmail.com) at February 04, 2009 07:28 PM

Fun with zeta functions

In mpmath-trunk there is now an implementation of the Riemann-Siegel Z function as well as the related Riemann-Siegel theta function. There is also a function for computing Gram points.

A picture is worth a thousand words. The Z function is a kind of real-valued version of the Riemann zeta function on the critical strip (compare to this previous plot):

>>> from mpmath import *
>>> plot(siegelz, [0,80])



The implementation is entirely generic, so it works in the complex numbers:

>>> cplot(siegelz, points=50000)



>>> cplot(siegelz, [-25,25], [-25,25], points=50000)



Gram points are useful for locating zeros of the zeta/Z functions. Huge Gram points themselves are easily located:

>>> mp.dps = 50
>>> print grampoint(10**10)
3293531632.7283354545611526800803306343201329980271


Unfortunately, evaluation of the Riemann zeta or Riemann-Siegel Z functions is not feasible for such large inputs with the present zeta function implementation in mpmath. Lowering expectations a bit, one can still compute some fairly large zeros:

>>> g1 = grampoint(1000)
>>> g2 = grampoint(1001)
>>> r = findroot(siegelz, [g1, g2], solver='illinois')
>>> print r
1421.8505671870486539107068075509847506037846486061
>>> print g1 r g2
True
>>> nprint(siegelz(r))
-1.30567e-51

(I chose the Illinois solver here because it combines bracketing with the fast convergence of the secant method.)

A plot of the initial couple of Gram points:

>>> mp.dps = 15
>>> plot(lambda x: grampoint(floor(x)), [0, 30])



The following visualization of Gram points is perhaps more illustrative. It is usually the case that the Gram points and sign changes (i.e. zeros) of Z alternate with each other:


>>> gs = [grampoint(n) for n in range(25)]
>>> def marker(x):
... for g in gs:
... if abs(g-x) 0.2:
... return 1
... return 0
...
>>> plot([siegelz, marker], [0,50])




Put another way, it is usually the case that (-1)n Z(gn) is positive (this is Gram's law):

>>> (-1)**10 * siegelz(grampoint(10)) > 0
True
>>> (-1)**11 * siegelz(grampoint(11)) > 0
True
>>> (-1)**12 * siegelz(grampoint(12)) > 0
True


The first exception occurs at n = 126 (more exceptions are listed in OEIS A114856). The Gram point is very close to a root:

>>> (-1)**126 * siegelz(grampoint(126)) > 0
False
>>> mp.dps = 50
>>> print grampoint(126)
282.4547208234621746108397940690599354048302480008
>>> print findroot(siegelz, grampoint(126))
282.46511476505209623302720118650102420550683628035
>>> print siegelz(grampoint(126))
-0.02762949885719993669875120344077657187547768743854


Now, it takes only a few lines of code to enumerate exceptions to Gram's law, up to say n = 2000:


>>> mp.dps = 10
>>> for k in range(2000):
... g = grampoint(k)
... s = siegelz(g)
... if (-1)**k * s 0:
... print k, g, s
...
126 282.4547208 -0.02762949482
134 295.583907 -0.01690039157
195 391.4482021 0.0232894207
211 415.60146 0.3828895679
232 446.8057559 -0.1410432574
254 478.9568293 -0.0600779492
288 527.6973416 -0.6654588176
367 637.320354 0.1579920088
377 650.8910448 0.8376010676
379 653.5978317 0.217352745
397 677.8523216 0.1342801302
400 681.8765522 -0.06717029681
461 762.6678783 0.1525747623
507 822.4194896 0.7419389942
518 836.5739092 -0.3982959071
529 850.6795334 0.1756176097
567 899.0502587 0.8296209263
578 912.9534756 -0.3537611108
595 934.3572317 0.121967239
618 963.1605748 -0.04025206217
626 973.1388241 -0.1578260824
637 986.8259207 0.2852904149
654 1007.905352 -0.5547383392
668 1025.199826 -0.2608153866
692 1054.715528 -0.07836895313
694 1057.167832 -0.1696327251
703 1068.189532 1.004550445
715 1082.850818 0.1608682601
728 1098.690513 -0.6875362205
766 1144.741881 -0.1678183794
777 1158.005614 9.646277741e-3
793 1177.246581 0.7011668936
795 1179.647457 0.3577242914
807 1194.033228 0.7072513907
819 1208.386043 0.2285739236
848 1242.939633 -0.6161899199
857 1253.626041 0.2131228775
869 1267.84791 1.271208219
887 1289.124631 0.1830773844
964 1379.419269 -1.392200147
992 1411.979146 -0.06783933868
995 1415.459427 0.2623456354
1016 1439.777518 -0.9324891715
1028 1453.639612 -0.3106139874
1034 1460.561547 -0.6501820215
1043 1470.933184 0.0987483831
1046 1474.387414 -0.5680195111
1071 1503.115534 0.1530310857
1086 1520.304266 -0.106037306
1094 1529.457103 -0.1839654392
1111 1548.873932 0.02102960258
1113 1551.155349 0.8003510427
1135 1576.211048 0.2276419942
1156 1600.060766 -0.02925581253
1165 1610.262397 8.575848564e-3
1178 1624.977566 -0.2655885013
1207 1657.717899 0.8645603918
1209 1659.971552 0.36625445
1231 1684.725787 0.6156424059
1250 1706.052177 -0.8285966558
1263 1720.616514 0.8288043998
1276 1735.158918 -0.02734148029
1290 1750.795762 -0.1953269334
1294 1755.258868 -0.2675606503
1307 1769.750095 0.5163671897
1319 1783.107961 0.09216880716
1329 1794.225992 0.3012117179
1331 1796.448134 0.1319941485
1342 1808.661254 -0.1258506495
1344 1810.880254 -0.03061304567
1402 1875.026023 -0.5758575812
1430 1905.854681 -1.184795762
1443 1920.138276 1.294805086
1456 1934.403327 -0.03436404474
1485 1966.159589 0.3192688646
1487 1968.346369 0.864509178
1495 1977.089273 0.1463922251
1498 1980.366127 -0.01464596441
1513 1996.736314 0.2570944738
1517 2001.097757 0.8247075447
1532 2017.438527 -0.1691755392
1543 2029.407189 0.1807441973
1545 2031.581995 1.207382688
1600 2091.233527 -0.03995365882
1613 2105.289905 0.4695228803
1620 2112.852035 -0.3058765971
1643 2137.666434 0.8692760726
1646 2140.899441 -0.2679836804
1656 2151.670095 -0.4251702923
1669 2165.658156 0.9247164091
1672 2168.883971 -0.01601726917
1684 2181.779042 -0.1420385576
1702 2201.097281 -1.274939957
1704 2203.241961 -0.2540991172
1722 2222.528116 -0.3804543179
1744 2246.06145 -0.1531821977
1747 2249.267282 1.366292187
1760 2263.150267 -0.7349734815
1773 2277.0188 0.4817831199
1787 2291.938132 0.06919712065
1796 2301.520437 -0.1687993299
1804 2310.032371 -0.687368415
1816 2322.790332 -0.1377354318
1819 2325.977969 0.1645731507
1843 2351.452601 0.01004227096
1850 2358.873909 -0.10089997
1856 2365.231898 -0.1952129943
1863 2372.645911 1.120344703
1876 2386.404453 -0.1397386302
1892 2403.31974 -0.6220070934
1902 2413.881627 -0.367836015
1921 2433.927875 0.01043043105
1933 2446.574386 1.413878285
1935 2448.681071 1.881639813
1953 2467.627613 0.05088637184
1969 2484.448555 0.2965372534
1982 2498.101553 -0.3858252138


(Note: the Z values are not quite accurate to the indicated precision, due to the magnitude of the Gram points. Increasing the precision, one finds siegelz(grampoint(1982)) = -0.385825235416...)

For those interested in this topic, there is a very nice book called "Riemann's Zeta Function" by H. M. Edwards. There is just as much information to be found on the web, e.g. on the amazing "Numbers, constants and computation" site by Xavier Gourdon and Pascal Sebah.

by Fredrik Johansson (fredrik.johansson@gmail.com) at February 04, 2009 03:57 PM

Django-SymPy

Empathy, first application using django-sympy

Today I committed the first web app that uses sympy-django. My friend Angel Soler did the design and called it Empathy, and it is supposed to be something similar to Wolfram's Integrator , but much more capable (not only integrals, but potentially any method implemented in SymPy In the following days I will finish the app and hopefully it will soon be working at http://empathy.sympy.org.



As you can see, it has some nice Ajax-autocompleters (done with prototype and script.aculo.us). A lot remains still to be done, for example, export the formulas to png would be nice (SymPy already has this feature, only that it exports LaTeX to dvi and then uses dvi2png, which is too slow for this app ...), right now I am trying with PlasTeX



By the way, the app is also free source, you can check the code in the empathy/ directory

by Fabian Pedregosa (fabian.seoane@gmail.com) at February 04, 2009 02:56 PM

February 03, 2009

Django-SymPy

killing python threads, part II

Today I found a (nicer) way of killing python threads


def alarm(secs):
def wait(secs):
for n in xrange(timeout_secs):
time.sleep(1)
if signal_finished: break
else:
thread.interrupt_main()
thread.start_new_thread(wait, (secs,))

try:
alarm(timeout_secs)
exec code in context
signal_finished = True
except KeyboardInterrupt:
raise SafeEvalTimeoutException(timeout_secs)


This is taken from http://code.activestate.com/recipes/496746/, and it is nice because it uses that thread.interrupt_main() raises a KeyboardInterrupt to stop the main thread.

Related to the previous article, I think that this way is cleaner (it does not use undocumented functions like ctypes.pythonapi.PyThreadState_SetAsyncExc). On the other side it only can kill other than the main thread (so this post should have been called "killing python evaluation"), but for now this is enough for me.

by Fabian Pedregosa (fabian.seoane@gmail.com) at February 03, 2009 12:43 PM

January 30, 2009

Django-SymPy

Kill python threads

To ensure that a user does not make an arbitrary complex request (like those that can hang your server, like sympy.factorint(345234523452353453*234423545244534535)), my solution was to implement threads that would be killed if they had not finished before a specified time.

But there is a problem ... python threads can not be killed!

This turns out to be a long-lived discussion on the python community: http://bugs.python.org/issue221115

Fortunately I found a piece of code ( http://code.activestate.com/recipes/496960/ )that implement a .terminate() method for threads. Great!

by Fabian Pedregosa (fabian.seoane@gmail.com) at January 30, 2009 12:30 PM