

This blog describes the 10th week of the program. Some of the highlights of this week are:


This blog describes week 9, the beginning week of the final phase. This week, I continued to work on the extension of Compound Distributions as well as completing the Matrix Distributions. Some of the highlights of this week are:
This blog post describes my progress in week 8, the last week of Phase 2. Since we already have a SISO transfer function object available for block diagram algebra, I thought of adding a MIMO transfer function object to this control systems engineering package. The way it works...


This blog provides a brief description of the last week of the second phase, i.e. week 8. Some of the key highlights of this week are:


Hi everyone :) Long time no see.
First of all, GOOD NEWS: My PR on adding a SISO transfer function object is finally merged. Yay!! A robust transfer function object is now available for Single Input Single Output (SISO) block diagram algebra. Here is the documentation.
This week, I spent most of my time reading about algorithms to parametrize algebraic curves and surfaces.
In many cases, it is easier to define a region using an implicit equation than using its parametric form. For example, a sphere can be defined with the implicit equation x**2 + y**2 + z**2 = 4. Its parametric equations, although easy to determine, are tedious to write.
To integrate over implicitly defined regions, it is necessary to determine its parametric representation. I found the report on conversion methods between parametric and implicit curves from Christopher M. Hoffmann very useful. It lists several algorithms for curves and surfaces of different nature.
Every plane parametric curve can be expressed as an implicit curve. Some, but not all, implicit curves can be expressed as parametric curves. A similar statement holds for algebraic surfaces: every plane parametric surface can be expressed as an implicit surface, and some, but not all, implicit surfaces can be expressed as parametric surfaces.
One of the algorithms to parametrize conics is given below:
The difficult part is to find a point on the conic. For the first implementation, I just iterated over a few points and selected the point which satisfied the equation. I need to implement an algorithm that can find a point on the curve.
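To make the idea concrete, here is a small standalone sketch (my own illustrative example, not the actual SymPy implementation) of the standard trick: a line through a known point of the conic meets the conic in exactly one other point, whose coordinates are rational functions of the slope t.

```python
from sympy import symbols, solve, simplify

x, y, t = symbols('x y t')
# Circle of radius 2; the known point (-2, 0) lies on it.
conic = x**2 + y**2 - 4
# Substitute the line y = t*(x + 2) through (-2, 0) into the conic
sols = solve(conic.subs(y, t*(x + 2)), x)
# One root is x = -2 (the known point); the other yields the parametrization.
other = [s for s in sols if simplify(s + 2) != 0][0]
assert simplify(other - (2 - 2*t**2)/(1 + t**2)) == 0
```

The corresponding y-coordinate follows by substituting this x back into the line equation, giving a fully rational parametrization of the circle.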
The parametric equations determined by the algorithm are large and complex expressions. The vector_integrate function in most cases takes too much time to calculate the integral. I tried fixing this using simplification functions in SymPy like trigsimp and expand_trig. The problem still persists.
>>> ellipse = ImplicitRegion(x**2/4 + y**2/16 - 1, x, y)
>>> parametric_region_list(ellipse)
[ParametricRegion((8*tan(theta)/(tan(theta)**2 + 4), -4 + 8*tan(theta)**2/(tan(theta)**2 + 4)), (theta, 0, pi))]
>>> vector_integrate(2, ellipse)
### It gets stuck
The algorithm does not work for curves of higher algebraic degree like cubics.
A monoid is an algebraic curve of degree n that has a point of multiplicity n - 1. All conics and singular cubics are monoids. Parametrizing a monoid is easy if the special point is known.
To parametrize a monoid, we need to determine a singular point of multiplicity n - 1. The singular points are those points on the curve where both partial derivatives vanish. A singular point (x0, y0) of a curve C is said to be of multiplicity n if all the partial derivatives of f up to order n - 1 vanish there. This paper describes an algorithm to calculate such a point, but the algorithm is not trivial.
My goal is to complete the work on parametrizing conics. If I can find a simple algorithm to determine the singular point of the required multiplicity, I will start working on it.
My goal for this week was to add support of integration over objects of geometry module.
In my GSoC proposal, I mentioned implementing classes to represent commonly used regions. This would allow a user to easily define regions without bothering about their parametric representation. SymPy’s geometry module has classes to represent commonly used geometric entities like line, circle, and parabola. Francesco told me it would be better to add support for using these classes instead. The integral takes place over the boundary of the geometric entity, not over the area or volume enclosed.
My first approach was to add a parametric_region method to the classes of the geometry module. Francesco suggested not to modify the geometry module, as this would make it dependent on the vector module. We decided instead to implement a function parametric_region that returns the ParametricRegion of an object of the geometry module.
I learned about the singledispatch decorator of Python. It is used to create overloaded functions.
Polygons cannot be represented as a single ParametricRegion object. Each side of a polygon has its separate parametric representation. To resolve this, we decided to return a list of ParametricRegion objects.
My next step was to modify vector_integrate to support objects of the geometry module. This was easy and involved calling the parametric_region_list function and integrating each ParametricRegion object.
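For illustration, here is a minimal, self-contained sketch of how singledispatch creates overloaded functions; the handlers below are purely hypothetical and only mimic the idea of dispatching on the type of the argument, as parametric_region_list dispatches on geometry classes.

```python
from functools import singledispatch

# Hypothetical dispatcher: the real code registers handlers for geometry
# classes; here we use int and list purely for illustration.
@singledispatch
def parametric_count(obj):
    raise NotImplementedError(type(obj).__name__)

@parametric_count.register(int)
def _(obj):
    return 1            # e.g. a single curve -> one region

@parametric_count.register(list)
def _(obj):
    return len(obj)     # e.g. a polygon -> one region per side

assert parametric_count(7) == 1
assert parametric_count([1, 2, 3]) == 3
```

The base function is the fallback; each register() call attaches an implementation chosen by the type of the first argument.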
The PR for the work has been merged. Note that we still do not have any way to create regions like disc and cone without their parametric representation. To support such regions, I think we need to add new classes to the geometry module.
I plan to work on implementing a class to represent implicitly defined regions.
The first phase of GSoC is over. We can now integrate the scalar or vector field over a parametrically defined region.
The PR for the ParametricIntegral class has been merged. In my last post, I mentioned a weird error. It turns out that I was importing pi as a symbol instead of as a number. Due to this, the expressions were returned unevaluated, causing tests to fail.
I had a detailed discussion with Francesco about the route ahead. My plan has drifted from my proposal because we have decided not to implement new classes for special regions like circles and spheres. The geometry module already has classes to represent some of these regions. We have decided to use these classes.
Francesco reminded me that we have to complete the documentation of these changes. I was planning to complete the documentation in the last phase. But I will try completing it soon as I may forget some details.
We also discussed adding support to perform integral transformations using Stokes’ and Green’s theorems. In many cases, such transformations can be useful, but adding them may be outside the scope of this project.
I have convinced Francesco about adding classes to represent implicitly defined regions. It might be the most difficult part of the project but hopefully, it will be useful to SymPy users.
My next week’s goal is to make a PR for adding support to integrate over objects of the geometry module.


Key highlights of this week’s work are:
Fixed incorrect limit evaluation related to the LambertW function
Added the _singularities feature to the LambertW function so that its limit gets evaluated using the meromorphic check already present in the limit codebase.
Fixed errors in assumptions when rewriting RisingFactorial / FallingFactorial as gamma or factorial
The gamma and factorial rewrite methods of RisingFactorial and FallingFactorial did not handle all the possible cases, which caused errors in some evaluations. Thus, we decided to come up with a proper rewrite using Piecewise, which returns the correct rewrite depending on the assumptions on the variables. Handling such rewrites using Piecewise is never easy, and thus there were a lot of failing test cases. After spending a lot of time debugging and fixing each failing test case, we were finally able to merge this.
This marks the end of Phase 2 of the program. I learnt a lot during these two months and gained many important things from my mentors.
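As a quick sanity check of the identities that any such rewrite has to reproduce (for positive arguments), one can verify them numerically; this is my own illustrative check, not code from the PR:

```python
from sympy import gamma, RisingFactorial, FallingFactorial

# rf(x, k) = gamma(x + k)/gamma(x) for positive x
assert RisingFactorial(5, 3) == 210            # 5*6*7
assert gamma(5 + 3) / gamma(5) == 210          # 5040 / 24
# ff(x, k) = gamma(x + 1)/gamma(x - k + 1) for positive x
assert FallingFactorial(5, 3) == 60            # 5*4*3
assert gamma(5 + 1) / gamma(5 - 3 + 1) == 60   # 120 / 2
```

For non-positive or non-integer arguments the gamma quotients need different branches, which is exactly what the Piecewise rewrite has to encode.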


This blog describes the 7th week of the program and the 3rd week of Phase 2. Some of the key highlights on the discussions and the implementations during this week are:


Key highlights of this week’s work are:
Improved the limit evaluations of Power objects
This PR improves the limit evaluations of Power objects.
We first check if the limit expression is a Power object and then evaluate the limit accordingly, depending on different cases.
First of all, we express b**e in the form exp(e*log(b)). After this, we check if e*log(b) is meromorphic and accordingly evaluate the final result. This check helps us handle the trivial cases at the very beginning.
Now, if e*log(b) is not meromorphic, then we separately evaluate the limits of the base and the exponent. This helps us determine the indeterminate form of the limit expression, if present. As we know, there are 3 indeterminate forms corresponding to power objects: 0**0, oo**0, and 1**oo, which need to be handled carefully.
If there is no indeterminate form present, then no further evaluations are required. Otherwise, we handle all the three cases separately and correctly evaluate the final result.
We also added some code to improve the evaluation of limits having Abs() expressions. For every Abs() term present in the limit expression, we replace it simply by its argument or by the negative of its argument, depending on whether the limit of the argument is greater than zero or less than zero for the given limit variable.
Finally, we were able to merge this after resolving some failing test cases.
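The three indeterminate power forms can be checked with classic examples (my own illustrative snippet, not test cases from the PR):

```python
from sympy import limit, symbols, oo, E

x = symbols('x', positive=True)
# The three indeterminate power forms:
assert limit((1 + 1/x)**x, x, oo) == E      # 1**oo
assert limit(x**x, x, 0, '+') == 1          # 0**0
assert limit(x**(1/x), x, oo) == 1          # oo**0
```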
Fixed limit evaluations involving error functions
The incorrect limit evaluations of error functions were mainly because the tractable rewrite was wrong and did not handle all the possible cases. For a proper rewrite, it was required that the limit variable be passed to the corresponding rewrite method. This is because, to define a correct rewrite, we had to evaluate the limit of the argument of the error function for the passed limit variable. Thus, we added a default argument limitvar to all the tractable rewrite methods and resolved this issue.
While debugging, we also noticed that the _eval_as_leading_term method of the error function was wrong, hence it was also fixed.
Finally, we were able to merge this after resolving some failing test cases.
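A couple of simple limits involving erf that exercise this machinery (an illustrative snippet of my own, not the PR's test cases):

```python
from sympy import limit, erf, oo, sqrt, pi, symbols

x = symbols('x')
# Limits of the error function at infinity
assert limit(erf(x), x, oo) == 1
assert limit(erf(x), x, -oo) == -1
# Near 0, erf(x) ~ 2*x/sqrt(pi), so the ratio tends to 2/sqrt(pi)
assert limit(erf(x)/x, x, 0) == 2/sqrt(pi)
```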


This blog describes the 6th week of the official program and the 2nd week of Phase 2. By the end of this week, the Compound Distributions framework is ready as targeted, and I will now focus on Joint Distributions in the upcoming weeks of this phase.
This blog post describes my progress in Week 5 of Google Summer of Code 2020!
I ended Week 4 by adding unit tests and a rough draft for the Series and Parallel classes. Now in this week, to complete the implementation, we decided to add another...


Key highlights of this week’s work are:
Fixed RecursionError and Timeout in limit evaluations
The recursion errors in limit evaluations were mainly due to the fact that the indeterminate form 1**oo was not handled accurately in the mrv() function of the Gruntz algorithm. So, some minor changes were required to fix those.
The major issue was to handle those cases which were timing out. On digging deep, we identified that the cancel() function of polytools.py was the reason. Thus, we decided to completely transform the cancel() function to speed up its algorithm. Now, after this major modification, many test cases were failing, as the cancel() function plays an important role in simplifying evaluations and is thus used at many places across the codebase. Therefore, a lot of time was spent in debugging and rectifying these test cases.
Finally, we were able to merge this, enhancing the limit evaluation capabilities of SymPy.
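For context, cancel() puts a rational function into the canonical form p/q with common factors removed, which is why its speed matters so much for limit evaluation; a toy example:

```python
from sympy import symbols, cancel

x = symbols('x')
# cancel() removes the common factor (x - 1)
assert cancel((x**2 - 1)/(x - 1)) == x + 1
# It also combines terms over a common denominator first: (1 + x**2 - 1)/x = x
assert cancel(1/x + (x**2 - 1)/x) == x
```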


This blog describes week 5, the beginning week of Phase 2. Phase 2 will be mostly focused on Compound Distributions, which have been stalled since 2018, and on additions to Joint Distributions.
With this, the fourth week and phase 1 of GSoC 2020 is over. Here I will give you a brief summary of my progress this week.
The initial days were spent mostly on modifying unit tests for the Series and Parallel classes which I added in the...
I spent this week working on the implementation of the ParametricIntegral class.
When I was writing the test cases, I realized that the API of the ParametricRegion could be improved. Instead of passing limits as a dictionary, tuples can be used.
So I modified the API of the ParametricRegion class. The new API is closer to the API of the Integral class and more intuitive. I made a separate PR for this change to make it easy for reviewers.
Example of the new API:
ParametricRegion( ( x+y, x*y ), (x, 0, 2), (y, 0, 2))
As discussed in previous posts, we decided not to associate a coordinate system with the parametric region. Instead, we assume that the parametric region is defined in the coordinate system of the field whose integral is being calculated. We calculate the position vector and normal vector of the parametric region using the base scalars and vectors of the field. This works fine for most cases. But when the field does not contain any base scalar or vector in its expression, we cannot extract the coordinate system from the field.
ParametricIntegral(150, circle)
# We cannot determine the coordinate system from the field.
# To calculate the line integral, we need to find the derivative of the position vector.
# This is not possible until we know the base vector of the field
To handle this situation, I assign a coordinate system C to the region. This does not affect the result in any way as the result is independent of it. It just allows the existing algorithm to work in this case.
Francesco suggested making separate classes based on the nature of the field: vector and scalar. I am open to this idea. But I think it will be easier and more intuitive for users if they can use the same class to calculate the integral. I do not think the two cases are different enough from a user’s perspective to have separate interfaces.
Maybe we can have a function vector_integrate which returns an object of ParametricVectorIntegral or ParametricScalarIntegral depending on the nature of the field. This can work for other types of integrals too. Suppose we implement a class ImplicitIntegral to calculate the integral over an implicit region. The vector_integrate function can then return an ImplicitIntegral object by identifying that the region is defined implicitly. I think this will be great. I will have more discussion with Francesco on this aspect.
When evaluating a double integral, the result sometimes depends upon the order in which the integral is evaluated. If the bounds of one parameter u depend on another parameter v, we should integrate first with respect to u and then v.
For example, consider the problem of evaluating the area of the triangle.
T = ParametricRegion((x, y), (x, 0, 2), (y, 0, 10 - 5*x))
The area of the triangle is 10 units and should be independent of the order in which parameters are specified at the time of object initialization. But the double integral depends on the order of integration.
>>> integrate(1, (x, 0, 2), (y, 0, 10 - 5*x))
20 - 10*x
>>> integrate(1, (y, 0, 10 - 5*x), (x, 0, 2))
10
So parameters must be passed to integrate in the correct order. To overcome this issue, we topologically sort the parameters. SymPy already had a function to perform topological sort in its utilities module. I implemented a function that generates the graph and passes it to the topological_sort function. This made my work easy.
Some integrals are taking too long to compute. When base scalars in the field are replaced by their parametric equivalents, the expression of the field becomes large. Further, the integrand is the dot product of the field with a vector, or the product of the field and the magnitude of a vector. The integrate function takes about 20-30 seconds to calculate the integral. I think this behavior is due to the expression of the integrand growing structurally despite being mathematically simple.
For example,
>>> solidsphere = ParametricRegion((r*sin(phi)*cos(theta), r*sin(phi)*sin(theta), r*cos(phi)),\
(r, 0, 2), (theta, 0, 2*pi), (phi, 0, pi))
>>> ParametricIntegral(C.x**2 + C.y**2, solidsphere)
In this case, the field expressed in terms of the parameters becomes r**2*sin(phi)**2*sin(theta)**2 + r**2*sin(phi)**2*cos(theta)**2, although it can easily be simplified to r**2*sin(phi)**2.
SymPy has a function called simplify, which attempts to apply various methods to simplify an expression. When the integrand is simplified with it before being passed to integrate, the result is returned almost immediately. Francesco rightly pointed out that simplify is computationally expensive and that we can instead try some specific simplifications. I will look into it.
Some test cases are failing because the integrate function returns results in different forms. The results are mathematically equivalent but different in terms of structure. I found this strange. I do not think this has to do with hashing. I still have not figured out this problem.
Hopefully, I will complete the work on ParametricIntegral and get it merged. We can then start discussing representing implicit regions.
Part of this week was spent in getting the PR for ParametricRegion merged. I made some fixes to get the failing tests to pass. I also added the docstring.
I started working on the ParametricIntegral class. The ParametricIntegral class returns the integral of a scalar or vector field over a region defined parametrically. It should be able to calculate a line, surface, or volume integral depending on the nature of the parametric region. To identify the nature of the region, a dimensions property needs to be added to the ParametricRegion class.
I started with writing the test cases to avoid my previous mistake of working on the implementation before deciding the API. The API of the ParametricIntegral class:
ParametricIntegral(field, parametricregion)
Some examples:
>>> curve = ParametricRegion((3*t  2, t + 1), {t: (1, 2)})
>>> ParametricIntegral(C.x, curve)
5*sqrt(10)/2
>>> semisphere = ParametricRegion((theta, phi), (2*sin(phi)*cos(theta), 2*sin(phi)*sin(theta), 2*cos(phi)),\
{theta: (0, 2*pi), phi: (0, pi/2)})
>>> ParametricIntegral(C.z, semisphere)
8*pi
I initially thought to make it a subclass of Integral, as ParametricIntegral is a kind of integral. But there was no reuse of any code from Integral in ParametricIntegral, so I decided against it.
A rough sketch of the algorithm to calculate the integral is described below:
- Use the integrate function to calculate the integral.
- If the integrate function can evaluate the integral, return it; otherwise return an unevaluated object of the ParametricIntegral class.
We also had a discussion on whether swapping the upper and lower limits has any effect on the result of a vector integral. For multiple integrals, the sign of the result changes when the upper and lower bounds of a parameter are interchanged. I do not think we came to any conclusion. But this won’t matter for integrals over parametric regions, as the bounds are directly defined by the user.
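The sign-flip behavior for swapped bounds is easy to demonstrate on a one-dimensional integral:

```python
from sympy import integrate, symbols

x = symbols('x')
# Interchanging the upper and lower bounds flips the sign of the result
assert integrate(x, (x, 0, 2)) == 2
assert integrate(x, (x, 2, 0)) == -2
```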
I will start working on the implementation of the ParametricIntegral class. I am also trying to get more involved in SymPy outside of my GSoC project. I hope I can become a contributing member of the community.


Key highlights of this week’s work are:
Adds cdir parameter to handle series expansions on branch cuts
Finally, after spending almost 2 weeks on this, we were able to merge the PR, adding a very important functionality of series expansions on branch cuts to the codebase.
Previously, SymPy either raised some error or computed the series expansion incorrectly when the input value was on the branch cut. But now, for most of the functions, the expansion produced is correct.
Not only this, we added the cdir parameter to the leadterm and as_leading_term functions as well. We even extended the functionality a bit to the limits module, so now, limits of values lying on the branch cuts of a function are also computed correctly in most cases.
We are planning to extend this functionality to all the remaining special functions and wherever else possible, to make the codebase even more robust.


With this blog, the awesome month of statistical learning and coding, and the official Phase 1 of Google Summer of Code 2020, comes to an end.
In the starting days of Week 3, I made further changes in the class-level docstrings for TransferFunction after Jason and Nikhil reviewed the PR. Jason also mentioned that I had to add a new page in the SymPy docs which should include the purpose of adding a control...


Key highlights of this week’s work are:
Fixed incorrect evaluation caused by subfactorial in limit_seq expressions
A rewrite converting any subfactorial term present in an expression into an equivalent term expressed in the form of factorial or gamma was added, which helped resolve the issue.
Adds cdir parameter to handle series expansions on branch cuts
Currently, many functions in the codebase are unable to produce correct series expansions on branch cuts. As a result, the limit evaluation takes place incorrectly for these functions when the limiting value lies on the branch cuts. Thus, we have decided to come up with a parameter named cdir, which stands for complex direction; it indicates the direction from which the series expansion is required, thus helping us to produce the correct series expansion. Special care needs to be taken while handling series expansions at branch points.
Once we are finished with this work, it will be a major enhancement to the limit evaluation and series expansion capabilities of SymPy.
This marks the end of Phase 1 of the program. I learnt a lot during this one month and gained many important things from my mentors.


This blog provides a brief description of the last week, i.e. week 4, of Phase 1. Some of the key highlights of the discussions and the implementation during this week are described below:
The first week of GSoC has been completed. My goal for this week was to add the ParametricRegion class. Many common regions have a well-known parametric representation, and calculating integrals on parametric regions is a common problem of vector calculus.
The PR for adding the ParametricRegion class is #19472.
An object of ParametricRegion class represents a region in space defined in terms of parameters. If a user wants to perform integration on a region, they have to create a ParametricRegion object. This object can then be passed to other classes to perform integration.
The implementation part was easy. The difficulty was in deciding the API. I spent much of the week discussing the API, but it was worth it. The ParametricRegion class will most likely be used by many users, so it is important for it to be simple and intuitive. I think I should have first completed writing test cases and examples.
The final API of ParametricRegion class:
ParametricRegion(parameters_or_coordsys, definition, limits=None)
Some Examples:
p = ParametricRegion(t, (t, t**3), limits={t: (1, 2)})
halfdisc = ParametricRegion((r, theta), (r*cos(theta), r*sin(theta)), {r: (-2, 2), theta: (0, pi)})
cylinder = ParametricRegion((r, theta, z), (cos(theta), r*sin(theta), C.z), {theta: (0, 2*pi), z: (0, 4)})
This argument specifies the parameters of the parametric region. When there is more than one parameter, the user has to pass them as a tuple. When a CoordSys3D object is passed, its base scalars are used as parameters.
The definition tuple maps the base scalars to their parametric representation. It is not necessary for every base scalar to have a parametric definition.
A parametric representation just defines the base scalars in terms of some parameters. If the base scalars are not known, the region cannot be completely specified. My initial approach was to include an argument to specify the coordinate system.
Passing a CoordSys3D object can be non-intuitive for users. In most cases, the base scalars of the parametric region are the same as those of the vector/scalar field. We can avoid passing the CoordSys3D object if we assume that the base scalars of the parametric region and the field are the same.
p = ParametricRegion((r, theta), (r*cos(theta), r*sin(theta)), limits={r: (0,1), theta: (0, pi)})
Integral(C.x**2*C.y, p)
# C.x = r*cos(theta), C.y = r*sin(theta), C.z = C.z
Integral(R.x*R.y*R.z, p)
# R.x = r*cos(theta), R.y = r*sin(theta), R.z = R.z
There can be a situation where SymPy cannot determine the coordinate system of the field. In such situations, we can raise a ValueError. For example:
Integral(R.x*C.z, p)
Most parametric regions have bounds on their parameters. Note that it is not necessary for every parameter to have an upper and lower bound.
There can be many methods for a user to define the bounds of a parametric region. We decided to use a dictionary, which we named limits. The keys of the dictionary are parameters and the values are tuples of (lower bound, upper bound). Another approach is to use a nested bound tuple, where every subtuple represents the bounds of a parameter.
My next week’s task is to get started with ParametricIntegral class. This class will represent integral of a scalar/vector field on a parametric surface. I will first write the unit tests and examples.
Week 2 of the coding period is now over. It was a week full of learning and a lot of challenges. I had my first meeting with Nikhil and Ishan on June 4, between 5:00 and 6:00 p.m., where we discussed a few things regarding the implementation part for this...


Key highlights of this week’s work are:
Fixed _eval_nseries() of Power
This was a long pending issue.
Previously, in the codebase, the series expansion of b**e was computed by breaking the code into different cases, depending on the values of the exponent, whether the exponent has a symbol, etc. Moreover, there was code to handle specific cases, and it was not written in a general way. As a result, the code was very long and it was difficult to debug when some issue popped up.
Hence, it was very important to completely rewrite and clean up Pow._eval_nseries(), so that many issues get resolved and it becomes easy to debug any further issues related to series expansions or limit evaluations. Thus, we came up with a general algorithm covering all the cases.
The series expansion of b**e is computed as follows:
- Express b as f*(1 + g), where f is the leading term of b and g has order O(x**d), with d strictly positive.
- Then b**e = (f**e)*((1 + g)**e), where (1 + g)**e is computed using the concept of the binomial series.
The major challenge which we had to face was the fragile nature of the existing code of Pow._eval_nseries().
Changing the code even a bit resulted in many test failures, as this function plays an important role in both series expansions and limit evaluations. At times, it became extremely difficult to debug the cause of failures because there were several other functions as well on which the code of this function depended. Not only this, fixing one failure caused several others to pop up.
Ultimately, after a week of hard work, we got everything working. After that, we further optimised the code, ensuring that we were not compromising on efficiency. At last, this issue was resolved and we ended up adding an extremely optimised implementation of the function to the codebase.
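A familiar instance of the binomial-series step described above, with b = 1 + x and e = 1/2 (my own illustrative example):

```python
from sympy import symbols, sqrt, series, O

x = symbols('x')
# (1 + g)**e with g = x and e = 1/2 expands via the binomial series
assert series(sqrt(1 + x), x, 0, 3) == 1 + x/2 - x**2/8 + O(x**3)
```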


This blog provides a brief overview of week 3 of Phase 1. Some of the key highlights of the discussions and the implementation during this week are described below:
This specific blog describes the first official week of the coding period. I actually started coding from May 21 by opening #19390. I decided to start off the coding period by adding the TransferFunction class… which was basically a class for representing LTI (linear, time-invariant) systems in...


On May 4, I got an email from Google that my proposal with SymPy had been accepted! Seriously, that was the best day of my life. I have learned a lot from this open source community over the past 6 months, and I hope to continue that.
The community...


Key highlights of this week’s work are:
This was a long pending issue.
Previously, in the codebase, the series expansion of a product was computed as the product of the expansions of the factors. This approach is correct only when the leading term of each series is a constant, but not in general.
For example, to compute the expansion of f(x)/x**10 at x = 0 to order O(x**10), it is necessary to compute the series expansion of the function f(x) to order O(x**20); computing it only to order O(x**10) would not suffice.
The strategy we implemented to resolve this issue was:
- Compute the order n0 of the leading term of the product as the sum of the orders of the leading terms of the factors.
- For each factor, compute n - n0 terms of its series expansion (starting from its leading term of order n1 and ending at order n - n0 + n1).
- Multiply the factor expansions together, truncating at terms of order n.
I enjoyed implementing all this because at every step we had to ensure that we were not compromising the efficiency of the code.
Finally, this issue was resolved and we ended up adding an extremely optimised implementation of the function to the codebase.
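To see why the extra terms are needed, consider a factor with a pole (an illustrative example of my own, not from the PR): the 1/x**3 factor shifts low-order terms of exp(x) up into the requested range.

```python
from sympy import symbols, exp, series, Rational

x = symbols('x')
# To expand exp(x)/x**3 correctly to O(x**2), exp(x) must internally be
# expanded to O(x**5), i.e. deeper than the requested order.
s = series(exp(x)/x**3, x, 0, 2)
assert s.removeO() == x**(-3) + x**(-2) + 1/(2*x) + Rational(1, 6) + x/24
```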
Used is_meromorphic() function to speed up limit evaluations
In this PR, we made use of the is_meromorphic() function of SymPy to speed up limit evaluations for certain types of cases. A function is said to be meromorphic at a point if its limit at that point exists, possibly being infinite. In these cases, the value of the limit can usually be determined with the help of the series expansion of that function, and thus there is no need to invoke the Gruntz algorithm.
While working on the implementation of this functionality, we required the leading term of the series expansion of the function in the limit expression, at the point at which the limit needs to be evaluated. But we came across a weird situation where, for some functions, we got complex infinity as the leading term. Thus, we had to rectify the _eval_as_leading_term() methods of these functions (done in a separate PR). After resolving this issue, we succeeded in adding the required functionality and hence increased the efficiency of the limit evaluation algorithm of SymPy.
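A few examples of the is_meromorphic() check (my own illustrative snippet):

```python
from sympy import symbols, sin, exp

x = symbols('x')
# sin(x)/x has a removable singularity and 1/x a pole at 0: both are
# meromorphic there, so their limits follow from series expansions.
assert (sin(x)/x).is_meromorphic(x, 0)
assert (1/x).is_meromorphic(x, 0)
# exp(1/x) has an essential singularity at 0: not meromorphic there.
assert not exp(1/x).is_meromorphic(x, 0)
```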
Planet SymPy is made from the blogs of SymPy's contributors. The opinions it contains are those of the contributor.