

Thank you to Ondřej Čertík, Oscar Benjamin, Travis Oliphant, and Pamela Chestek for reviewing this blog post. Any errors in the post are my own.
On April 19, 2022, for approximately 11.5 hours, the documentation website for the open source SymPy project and the corresponding GitHub repository were taken down as a result of a DMCA takedown notice submitted by WorthIT Solutions on behalf of HackerRank. In this post, I’d like to explain what the DMCA is and how it works, detail exactly what happened with this incident, and go over my views on the incident and some of the lessons we’ve learned as a result.
The information in this section has been drawn from various sources, including:
The Digital Millennium Copyright Act (DMCA) is a US law that was passed in 1998. It has laid the groundwork for how copyright is enforced in the age of the internet. One of the DMCA provisions that is important to understand is the so-called “safe harbor” provision. The safe harbor provision, roughly speaking, lets websites avoid liability when they are simply hosting or transmitting copyrighted content on behalf of their end-users.
The safe harbor provision has been absolutely essential to the modern internet. Without it, a website like GitHub could not exist. Imagine if, every time someone found their copyrighted content had been uploaded to GitHub without their permission, they could sue GitHub for damages. If this were the case, GitHub could never operate. The way GitHub works today, it cannot possibly know whether every single thing uploaded to it is uploaded legally. The safe harbor provision also makes modern social media websites like Facebook, Twitter, and YouTube possible.
However, in order for a website to have and maintain safe harbor status, it must follow certain practices as outlined by the DMCA. In particular, sites must manage the “notice-and-takedown process”. GitHub’s notice-and-takedown process is described in its DMCA takedown policy.
The noticeandtakedown process works like this: a provider (like GitHub) has a means for a copyright holder to submit a notice. In GitHub’s case, they provide an online form. This notice must include certain information, including the name and address of the submitter, identification of the copyrighted work, the material on the site that is infringing copyright, and a statement of good faith, signed under penalty of perjury, that the person submitting the notice owns the copyright and believes their rights have been infringed upon. GitHub’s form also requires the submitter to assert that they have taken fair use into consideration.
Once the notice is submitted and once the provider confirms that the above things are done, they must “expeditiously” remove public access to the content. The DMCA does not define “expeditiously”.
The owner of the accused content then has the opportunity to file a counter notice. GitHub’s counter notice procedure is described in its guide to submitting a DMCA counter notice. This gives the person whose content has been removed recourse if they believe the original notice was invalid. A counter notice contains a statement by the user that the content was removed “as a result of mistake or misidentification”, signed under penalty of perjury. When a provider receives a counter notice, they must forward it to the original complainant, and restore the challenged material after 10-14 business days. At this point, if the complainant wishes to keep the material down, they must file a lawsuit against the alleged infringer.
By following these procedures as outlined by the DMCA, GitHub avoids liability to either party.
GitHub goes beyond what the DMCA law requires. All notices and counter notices that have ever been submitted to GitHub are published publicly in the github/dmca repository (with personal information redacted). GitHub has also historically sided with developers in high-profile infringement cases, such as the youtube-dl incident in 2020 (this incident involved a takedown relating to copyright circumvention, which is a different part of the DMCA law that I haven’t discussed here because it isn’t relevant to the notice that was sent to SymPy).
It’s important to understand, however, that GitHub’s hands are tied in many ways here. If they do not follow the notice and counter notice procedures exactly as outlined in the DMCA law, they risk losing their safe harbor status with the US Copyright Office. Were this to happen, it would be completely disastrous to GitHub as a business. Without safe harbor protection, GitHub could be liable for any copyrighted content that someone uploaded to its platform.
Finally, a technical note: when someone submits a takedown notice against a single part of a GitHub repository, GitHub’s response is to take down the entire repository, because due to the way git works it is not technically possible to remove only part of a repository without rewriting its history. Completely scrubbing data from git history is possible, but it’s a destructive process that’s very disruptive to developers. Instead, GitHub requires the repository owners to scrub the data themselves, if they choose to, within one business day. Otherwise, it takes the entire repository down.
Below I provide a quick timeline of what happened. Times are in Mountain Daylight Time, which is where I live.
First, for background, the SymPy documentation website https://docs.sympy.org/ is hosted on GitHub Pages and mirrors the repository https://github.com/sympy/sympy_doc. It is common for websites to be hosted on GitHub Pages because it’s completely free and easy to set up for someone who is already used to using git and GitHub. This blog itself is hosted on GitHub Pages. The sympy_doc repository only contains the prebuilt HTML of SymPy’s documentation. The actual source code lives in the main SymPy repository. However, the DMCA claims were only ever made against the website, so the main SymPy source code repository was unaffected.
Friday, Apr 15, 4:26 PM: Admins of the sympy/sympy_doc GitHub repository (myself, Ondřej Čertík, and Oscar Benjamin) received an automated email from support@githubsupport.com informing us that a DMCA takedown notice had been filed against the SymPy documentation. Specifically, the notice, which can now be found in the github/dmca repository, claimed that “Someone has illegally copied our client's technical exams questions and answers from their official website https://www.hackerrank.com/ and uploaded them on their platform without permission.” It specifically referenced the page https://docs.sympy.org/latest/modules/solvers/solvers.html. The notice was made on behalf of HackerRank by an individual from “WorthIT Solutions Pvt. Ltd.” It also stated “the infringing website is not willing to remove our client's work”, which came as a surprise to us, since at no point prior to receiving this notice had we received any communication from HackerRank or WorthIT Solutions.
The support email stated that if we did not remove the offending content within one business day, the repository, and consequently the docs website, would be taken down.
While it is not relevant to the legality of the situation, it is worth noting that this was on Good Friday and the upcoming Sunday, April 17, was Easter Sunday, which limited our ability to effectively respond over the weekend.
Friday, April 15 – Tuesday, April 19: At first, we were not sure if the support email was legitimate or if it was just convincing spam. One thing that confused us was that the support email came from a domain, githubsupport.com, which did not appear to exist. Another issue was that although GitHub’s policy is to post all DMCA claims to the github/dmca repository, this particular claim had not yet been uploaded there. We reached out to GitHub support to ask if the email was legitimate, and they responded on Monday, April 18, at 3:18 PM that the email was indeed a legitimate email from GitHub Trust & Safety.
I had already reached out on Friday to NumFOCUS for assistance. NumFOCUS is a 501(c)(3) nonprofit organization that represents many open source scientific projects including SymPy. However, due to the limited time frame of the notice and the fact that it occurred over a holiday weekend, NumFOCUS was not able to connect us with legal counsel until Tuesday.
Tuesday, April 19, 12:21 PM: We received notice from GitHub that the sympy/sympy_doc repository had been taken offline. The repository was replaced with a notice that the documentation had been taken down by the DMCA notice, with a link to the notice, which had been posted to the public github/dmca repository, and the documentation website itself started to 404. Since this was now public knowledge, we publicly announced it on the SymPy mailing list and Twitter.
We also worked with NumFOCUS and the legal counsel to send a counter notice to GitHub, since we believed that the notice was mistaken and that the allegedly infringing material, the code and mathematical examples on the docs page, was likely not even copyrightable.
Not long after the takedown occurred, someone posted it to Hacker News. The posting eventually made its way to the front page, and by the end of the day it had reached rank 3 on the site, receiving hundreds of upvotes and comments. The story also generated a lot of buzz on Twitter.
Tuesday, April 19, 6:40 PM: Vivek Ravisankar, the CEO of HackerRank, posted a public apology on the Hacker News thread:
Hello, I'm Vivek, founder/CEO of HackerRank. Our intention with this initiative is to takedown plagiarized code snippets or solutions to company assessments. This was definitely an unintended consequence. We are looking into it ASAP and also going to do an RCA to ensure this doesn't happen again.
Sorry, everyone!
Tuesday, April 19, 8:26 PM: Vivek posted a follow-up comment on Hacker News:
Hello again, Vivek, founder/CEO here. In the interest of moving swiftly, here are the actions we are going to take:
(1) We have withdrawn the DMCA notice for sympy; Sent a note to senior leadership in Github to act on this quickly.
(2) We have stopped the whole DMCA process for now and working on internal guidelines of what constitutes a real violation so that these kind of incidents don't happen. We are going to do this in-house
(3) We are going to donate $25k to the sympy project.
As a company we take a lot of pride in helping developers and it sucks to see this. I'm extremely sorry for what happened here.
HackerRank did follow through with the $25k donation to SymPy.
Wednesday, April 20, 12:00 AM (approximately): The SymPy documentation website and the sympy/sympy_doc repository went back online.
Wednesday, April 20, 2022, 10:11 AM: We received an email notice from GitHub that the DMCA claim against our repository had been retracted. The retraction is available online in the github/dmca repository.
Now that I’ve stated the facts, let me take a moment to give my own thoughts on this whole thing. My belief is that the DMCA claim that was made against the SymPy repository was completely baseless and without merit. Not only is no part of the SymPy documentation, to my knowledge, taken illegally from HackerRank or any other copyrighted source, but the claim itself was completely unspecific about which parts of the documentation were supposedly infringing. This made it impossible for us to comply even if the complaint had been legitimate. In addition, it is questionable whether the supposed parts of the documentation that were taken from HackerRank, the mathematical and code examples, are even copyrightable. While this may not have been a deliberately targeted attack against SymPy, for a community-run open source project it served as one in practice.
We have never learned any more about which parts of the SymPy documentation were supposedly infringing on HackerRank’s copyright, and I expect we never will. Some people online have speculated that HackerRank actually took examples from the SymPy documentation, not the other way around, but I do not know if this is the case or not. It seems likely, given the large number of other takedown notices WorthIT has made to GitHub on HackerRank’s behalf, that they were using some sort of automated or semi-automated detection system, which somehow detected what it thought was infringing content on our website. I have personally never used the HackerRank platform, so I only know what sorts of things are on it from secondhand accounts.
First, let me say that if anyone or anything is to be blamed for this, my personal belief is that the blame should primarily lie on the DMCA law itself. While the “safe harbor” provisions of the law have been critical to the development of the modern internet, allowing social websites such as GitHub to exist without the burden of legal liability for user-contributed content, other provisions are hostile to those same users, even those who are operating in a completely legal manner. The DMCA works in a “guilty until proven innocent” way, and the very operation of the takedown notice policy implicitly assumes that those making claims are not abusing the system as bad actors. It also incentivizes platforms such as GitHub to step aside and not take sides in claims, even claims such as this one, which refer to likely uncopyrightable material or are too unspecific to reasonably address even if they are legitimate.
The DMCA also has had further chilling effects due to its anti-circumvention provisions. The EFF has written extensively about these. While something as radical as abolishing copyright may be considered to be practically untenable, I believe that laws like the DMCA can be rewritten so that they still serve their intended function of protecting copyright owners and providing “safe harbor” to websites like GitHub, while also protecting the rights of those accused of infringing copyright.
With that being said, I think some blame does need to lie on HackerRank and WorthIT Solutions for abusing the DMCA system and filing claims like this. They also lied in their claim when they said “the infringing website is not willing to remove our client's work.” At no point were we contacted by HackerRank or WorthIT Solutions about any copyright infringement. If we had been, we would have been happy to work with them to determine whether any content in SymPy infringed on their copyright and remove it if it did. Even if it were the case that SymPy’s documentation infringed on HackerRank’s copyright, the immediate escalation to a DMCA takedown notice, which legally forced GitHub to completely disable the entire SymPy documentation, was completely inappropriate. Even submitting a counter notice is risky, because doing so increases the likelihood that the complaining party will bring an infringement lawsuit. Many DMCA’d repositories do not submit a counter notice for this reason, and we were only able to do so comfortably because we had legal support through NumFOCUS. Had HackerRank not retracted the notice, we would have had to wait for our counter notice to take effect, meaning our docs website would have remained unavailable for 10-14 days. And had they targeted the main SymPy repository instead of the sympy_doc repository where the docs are hosted, this would have been a major disruption to our entire development workflow.
I do want to thank Vivek for quickly retracting the notice once this came to his attention and for HackerRank’s donation to support SymPy and NumFOCUS, and I hope that he will rethink HackerRank’s DMCA policies.
Finally, while many have blamed GitHub, I believe that for the most part, they have acted as they are effectively required to by law. It is important to understand that any similar website based in the US would have likely treated this claim in a similar way. For instance, here is GitLab’s DMCA policy, which you can see is very similar to GitHub’s. By all accounts, GitHub is an industry leader here, by posting all notices and counter notices online, which they are not required to do, and by being very open about their DMCA policies ([1], [2], [3]). With that being said, there are ways that GitHub’s processes could be improved here. The one business day that we were given to respond is actually standard in the industry, and ultimately comes from the “expeditiously” wording of the DMCA. But even so, if we were given more time than this, it would have made things much easier. It may have even been possible for us to reach out to HackerRank and resolve the situation before any actual takedown occurred. It also would be helpful if GitHub, while staying within the DMCA provisions, had a higher standard of legitimacy for takedown notices such as the one we received. As I have noted, the request we received was not nearly specific enough for us to effectively act on it, which should have been apparent to GitHub, since the only information WorthIT provided was a link to the HackerRank home page. Finally, I have two small technical requests from GitHub: first is to make githubsupport.com a real website so that support emails appear more legitimate, and second is to post DMCA notices to the DMCA repository right away rather than only after the repository is taken down, to make it clearer that the notices are in fact legitimate.
Open source communities such as SymPy are under-resourced and typically ill-equipped to handle situations like these, which inherently puts them in an inferior position. This is despite the fact that open source software serves a global commons that benefits all. I have estimated that SymPy itself is used by hundreds of thousands of people, and the broader scientific Python ecosystem that it is part of is used by millions of people. Abuses of copyright law against these communities are harmful to society as a whole.
I’d like to end with a few lessons that I’ve learned from this whole process.
I do feel that being on a site like GitHub is still preferable to something like self-hosting. If you self-host, you become responsible for a ton of things which GitHub does for you, like making sure everything stays online, handling servers, and managing spam. Also, self-hosting does not magically shield you from legal threats. If you self-host content that infringes on someone’s copyright, you are still legally liable for hosting that content.
Finally, I want to thank everyone who supported SymPy during this incident. Even if you just talked about this on social media or upvoted the Hacker News story, that helped us get this into the public eye, which led to a much faster resolution than we expected. I especially want to thank Leah Silen and Arliss Collins from NumFOCUS; Ondřej Čertík, Oscar Benjamin, and other SymPy community members; Pamela Chestek, the NumFOCUS legal counsel; and Travis Oliphant for the help they provided to us. Thank you to Thomas Dohmke, GitHub’s CEO, who we have reached out to privately, and who has promised to improve GitHub’s DMCA policies. I also want to thank Vivek Ravisankar from HackerRank for retracting the DMCA claim, and for the generous donation to SymPy.
Now that this incident is over, I’m hopeful we in the SymPy community can all go back to building software.
If you wish to donate to support SymPy, you may do so here.


I've ditched Disqus as the comment system on this blog. I am now using Utterances. Utterances is an open source comment system that is backed by GitHub issues. Basically, every post has a corresponding issue opened on GitHub, and the comments on the post are the comments on the issue. Utterances automatically places the comments at the bottom of the post. For example, here is the issue corresponding to this post.
I didn't like Disqus mostly because it serves ads and tracking. Even though I had opted out of as much of it as I could in the Disqus settings, it still loads tracking scripts on every page. I always run uBlock Origin, and it's a bit hypocritical if my own site has things that are blocked by it. In some cases I can't avoid it (as far as I know), like when I embed a YouTube video, but it definitely shouldn't be on every post.
Utterances is a very nice alternative. It has lots of advantages:
If you use Nikola like I do, it natively supports Utterances (a feature which I added). Otherwise, go to the Utterances site and paste the script tag it generates at the bottom into your blog template. Then install the Utterances app on your repo, and you are done.
Some of my old posts had Disqus comments, which I wanted to preserve somehow. Here is a guide to how I did that, since it wasn't as straightforward as I would have hoped.
The first step is to export your Disqus comments. It's very difficult to find the place on the Disqus site where you do this, but I finally found the URL. The export takes some time to complete (for me it took about half an hour). When it finishes, Disqus emails you an XML file with all your comments. Note that the file contains all comments for all sites you have ever set up with Disqus. For me, it also included all the comments on my old Wordpress blog, as well as comments on draft blog posts that I never ended up publishing. It also contained all comments that were marked as spam, so you will need to remember to filter those out.
I decided that since I only have a handful of posts with Disqus comments, I would just write a script to process them all and print them out, and then manually enter them into the Utterances comment system for those posts.
I wrote a script to process the comments, which you can find here. Disqus does provide an XML schema for the export. I used a library called xsdata, which lets you take an XML schema and generate corresponding Python dataclasses, making the parsed XML much easier to manipulate than with the standard library xml module. The script outputs text like
========== Comments from https://asmeurer.github.io/blog/posts/what-happens-when-you-mess-with-hashing-in-python/ ==========
These are the original comments on this post that were made when this blog used the [Disqus comment system](https://www.asmeurer.com/blog/posts/switching-to-utterances-comments/).
>**Comment from bjd2385 on 2016-08-28 12:33:12+00:00:**
><p>Very interesting post. I was just looking into hash functions (I've never formally learned what the topic entails), and since I'm most familiar with Python this post explained quite a bit, especially your early mathematical points.</p>
>**Comment from Mark Lawrence on 2016-10-03 20:26:54+00:00:**
><p>At what point does Python 3 force the override of __hash__ if you've defined __eq__? E.g when would your</p><p>AlwaysEqual class fail?</p>
>**Replies:**
>>**Comment from asmeurer on 2016-10-03 20:38:13+00:00:**
>><p>That's a good catch. I originally wrote this post in Python 2. The example does indeed fail in Python 3. More specifically, if you override __eq__, Python 3 automatically sets __hash__ to None. I'll update the post to make this more clear.</p>
>**Comment from Erick Mendonça on 2017-07-30 03:23:55+00:00:**
><p>Great article! We must really pay attention to these details when implementing custom hashes.</p>
>**Comment from Ignacio on 2017-10-07 22:31:56+00:00:**
><p>Thanks a lot for this post! Clarified a lot of concepts.</p>
which I then manually copied to each post's Utterances page on GitHub.
Feel free to adapt my script if you find yourself in a similar situation.
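If you just want the gist of the processing without pulling in xsdata, here is a minimal sketch using only the standard library xml.etree module. The element and namespace names are the ones my export file used; treat them as assumptions for your own export:

```python
import xml.etree.ElementTree as ET

# Namespaces used in my Disqus export file (assumptions; check your own export)
NS = {"d": "http://disqus.com", "dsq": "http://disqus.com/disqus-internals"}
DSQ_ID = "{http://disqus.com/disqus-internals}id"

def extract_comments(xml_text):
    """Return non-spam comments as (thread link, author, message) tuples."""
    root = ET.fromstring(xml_text)
    # Map internal thread ids to the page URL they belong to
    links = {
        th.get(DSQ_ID): th.findtext("d:link", namespaces=NS)
        for th in root.findall("d:thread", NS)
    }
    comments = []
    for post in root.findall("d:post", NS):
        # The export includes comments marked as spam; filter them out
        if post.findtext("d:isSpam", namespaces=NS) == "true":
            continue
        tid = post.find("d:thread", NS).get(DSQ_ID)
        author = post.findtext("d:author/d:name", namespaces=NS)
        message = post.findtext("d:message", namespaces=NS)
        comments.append((links.get(tid), author, message))
    return comments

# Usage (hypothetical file name):
# for link, author, message in extract_comments(open("disqus_export.xml").read()):
#     print(link, author, message)
```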
Feel free to use the comments on this page to play around with the commenting system.
Note that to comment, there are two options. You can log in on this page, which will let you type your comment in the box below. This requires giving the Utterances bot access to your GitHub account. Alternately, if you don't want to give a bot access, you can just go directly to the GitHub issue page and comment there. I am currently in the process of figuring out how to add some boilerplate to each page that makes this clear (see this Utterances issue). If anyone has any suggestions on how to do this, let me know. For now, I am just going to manually add a statement about this as the first comment on each post.


This summer I had plenty of time during COVID-19 lockdown and I was looking at SymPy Gamma.
SymPy Gamma is a web application that evaluates mathematical expressions from natural language input: after parsing the input into SymPy expressions, it displays the result along with additional related computations. It is inspired by WolframAlpha, which is based on the commercial computer algebra system Mathematica.
I have always been impressed by it ever since I first found out about it. While playing with it this summer, I realised that it runs on Google App Engine’s Python 2.7 runtime. It is powered by SymPy, an open source computer algebra system.
Despite being widely used around the world (about 14K users every day, as seen from Google Analytics), there hasn’t been a lot of development in the past 5 years. Because of this, the infrastructure was stuck on Google App Engine’s Python 2 runtime, which obviously does not support Python 3.
This also prevented it from using the latest version of SymPy. The SymPy version it was using (~0.7.6) was last updated 6 years ago, so SymPy Gamma was in urgent need of an upgrade. At the time of writing, SymPy Gamma is running on Google App Engine’s latest runtime and the latest version of SymPy.
It was a fun project, and it was interesting to see how Google’s cloud offering has evolved from Google App Engine to Google Cloud Platform. The old App Engine did seem like a minimal cloud offering launched by Google in an attempt to ship something early and quickly. It reminded me of my first cloud project in college (dturmscrap), which I deployed to Google App Engine back in 2015.
I used GitHub Projects to track the whole project; all the work done for it can be seen here.
Here is a summary of what was achieved:
PR 135: Migrating Django to a slightly higher version; this was a first step just to make sure everything was working. I upgraded it to the latest version of Django that was supported on the Python 2 runtime. This also exposed the broken CI, which was fixed in this PR.
PR 137: This upgraded the CI infrastructure to use the Google Cloud SDK for deployment; the previous method had been discontinued.
PR 140: Upgraded the database backend to use Cloud NDB instead of the legacy App Engine NDB.
PR 148: Since we changed the database backend, we needed something for testing locally; this was done by running the Google Cloud Datastore emulator locally.
PR 149: The installation and setup of the project was quite a challenge. Installing and keeping track of the versions of a number of applications was non-trivial. This pull request dockerized the project and made the development setup trivial: it all boils down to just one command.
PR 152: The login feature was previously implemented using the Users API of Google App Engine’s Python 2 runtime, which is not available in the Python 3 runtime. We removed the login feature, as it was not used by many people and setting up OAuth for login was too much effort.
PR 153: Now was the time to slowly move towards Python 3 by making the code compatible with both 2 and 3. This was achieved via python-modernize.
PR 154: We then made the migration to the Python 3.7 runtime, removed the submodules, and introduced a requirements.txt for installing dependencies.
PR 159: The above change made it possible to upgrade SymPy to the latest version, which was 1.6 at the time.
PR 165: The last piece of the puzzle was upgrading Django itself, so we upgraded it to the latest version, which was Django 3.0.8.
Thanks to Google for properly documenting the process, which made the transition much easier.
Thanks to NumFOCUS; without them this project would not have been possible. Also thanks to Ondřej Čertík and Aaron Meurer for their advice and support throughout the project.
This report summarizes the work done in my GSoC 2020 project, Implementation of Vector Integration with SymPy. Blog posts with the step-by-step development of the project are available at friyaz.github.io.
I am Faisal Riyaz, and I have completed my third year of B.Tech in Computer Engineering at Aligarh Muslim University.
The goal of the project is to add functions and class structure to support the integration of scalar and vector fields over curves, surfaces, and volumes. My mentors were Francesco Bonazzi (Primary) and Divyanshu Thakur.
Here is a list of PRs which were opened during the span of GSoC:
(Merged) #19472: Adds ParametricRegion class to represent parametrically defined regions.
(Merged) #19539: Adds ParametricIntegral class to represent the integral of a scalar or vector field over a parametrically defined region.
(Merged) #19580: Modified the API of ParametricRegion.
(Merged) #19650: Added support to integrate over objects of the geometry module.
(Merged) #19681: Added ImplicitRegion class to represent implicitly defined regions.
(Merged) #19807: Implemented the algorithm for rational parametrization of conics.
(Merged) #20000: Added the API of the newly added classes to SymPy’s documentation.
(Open) #19883: Allow vector_integrate to handle ImplicitRegion objects.
(Open) #20021: Add usage examples.
We had an important discussion regarding our initial approach and possible problems in issue #19320.
The ParametricRegion class is used to represent a parametric region in space.
>>> from sympy.vector import CoordSys3D, ParametricRegion, vector_integrate
>>> from sympy.abc import r, theta, phi, t, x, y, z
>>> from sympy import pi, sin, cos, Eq
>>> C = CoordSys3D('C')
>>> circle = ParametricRegion((4*cos(theta), 4*sin(theta)), (theta, 0, 2*pi))
>>> disc = ParametricRegion((r*cos(theta), r*sin(theta)), (r, 0, 4), (theta, 0, 2*pi))
>>> box = ParametricRegion((x, y, z), (x, 0, 1), (y, -1, 1), (z, 0, 2))
>>> cone = ParametricRegion((r*cos(theta), r*sin(theta), r), (theta, 0, 2*pi), (r, 0, 3))
>>> vector_integrate(1, circle)
8*pi
>>> vector_integrate(C.x*C.x - 8*C.y*C.x, disc)
64*pi
>>> vector_integrate(C.i + C.j, box)
(Integral(1, (x, 0, 1), (y, -1, 1), (z, 0, 2)))*C.i + (Integral(1, (x, 0, 1), (y, -1, 1), (z, 0, 2)))*C.j
>>> _.doit()
4*C.i + 4*C.j
>>> vector_integrate(C.x*C.i - C.z*C.j, cone)
9*pi
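As a sanity check of the cone result (my own verification, not part of the module), the flux can be reproduced with the classical surface-integral formula, integrating F · (∂S/∂θ × ∂S/∂r) over the parameter domain:

```python
from sympy import symbols, cos, sin, pi, integrate, Matrix

r, theta = symbols('r theta', positive=True)

# Parametrization of the cone, in the same (theta, r) parameter order as above
S = Matrix([r*cos(theta), r*sin(theta), r])
# Unnormalized normal vector: cross product of the tangent vectors
n = S.diff(theta).cross(S.diff(r))
# The field x*i - z*j evaluated on the surface (x = r*cos(theta), z = r)
F = Matrix([r*cos(theta), -r, 0])
flux = integrate(F.dot(n), (theta, 0, 2*pi), (r, 0, 3))
print(flux)  # 9*pi
```

This agrees with the value vector_integrate computes directly.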
Many regions, like circles, are commonly encountered in problems of vector calculus, and the geometry module of SymPy already had classes for some of these regions. The function parametric_region_list was added to determine the parametric representation of such classes. vector_integrate can directly integrate objects of the Point, Curve, Ellipse, Segment, and Polygon classes of the geometry module.
>>> from sympy.geometry import Point, Segment, Polygon, Circle
>>> s = Segment(Point(4, -1, 9), Point(1, 5, 7))
>>> triangle = Polygon((0, 0), (1, 0), (1, 1))
>>> circle2 = Circle(Point(2, 3), 6)
>>> vector_integrate(6, s)
42
>>> vector_integrate(C.x*C.y, circle2)
72*pi
>>> vector_integrate(C.y*C.z*C.i + C.x*C.x*C.k, triangle)
-C.z/2
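The triangle value can likewise be checked by hand (again my own check, not part of the module). The polygon lies in the xy-plane, so dz = 0 and only the y*z dx term of the field contributes; the edges (0,0)→(1,0) (where y = 0) and (1,0)→(1,1) (where dx = 0) contribute nothing, leaving only the closing edge (1,1)→(0,0):

```python
from sympy import symbols, integrate

t, z = symbols('t z')

# Parametrize the closing edge (1,1) -> (0,0) as x = 1 - t, y = 1 - t, t in [0, 1]
y_t = 1 - t
dx_dt = -1  # derivative of x = 1 - t
result = integrate(y_t * z * dx_dt, (t, 0, 1))
print(result)  # -z/2
```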
In many cases, it is difficult to determine the parametric representation of a region. Instead, the region is defined using its implicit equation. To represent implicitly defined regions, we implemented the ImplicitRegion class. But to integrate over a region, we need to determine its parametric representation, so the rational_parametrization method was added to ImplicitRegion.
>>> from sympy.vector import ImplicitRegion
>>> circle3 = ImplicitRegion((x, y), (x4)**2 + (y+3)**2  16)
>>> parabola = ImplicitRegion((x, y), (y  1)**2  4*(x + 6))
>>> ellipse = ImplicitRegion((x, y), (x**2/4 + y**2/16  1))
>>> rect_hyperbola = ImplicitRegion((x, y), x*y  1)
>>> sphere = ImplicitRegion((x, y, z), Eq(x**2 + y**2 + z**2, 2*x))
>>> circle3.rational_parametrization()
(8*t/(t**2 + 1) + 4, 8*t**2/(t**2 + 1)  7)
>>> parabola.rational_parametrization()
(6 + 4/t**2, 1 + 4/t)
>>> rect_hyperbola.rational_parametrization(t)
(1 + (t + 1)/t, t)
>>> ellipse.rational_parametrization()
(t/(2*(t**2/16 + 1/4)), t**2/(2*(t**2/16 + 1/4))  4)
>>> sphere.rational_parametrization()
(2/(s**2 + t**2 + 1), 2*s/(s**2 + t**2 + 1), 2*t/(s**2 + t**2 + 1))
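A quick sanity check (my own, not from the original post): substitute the sphere parametrization above back into the implicit equation and confirm it vanishes identically.

```python
from sympy import simplify, symbols

s, t = symbols('s t')
# The rational parametrization of the sphere x**2 + y**2 + z**2 = 2*x
# reported above.
X = 2/(s**2 + t**2 + 1)
Y = 2*s/(s**2 + t**2 + 1)
Z = 2*t/(s**2 + t**2 + 1)
# Substituting back into the implicit equation should give exactly 0.
print(simplify(X**2 + Y**2 + Z**2 - 2*X))  # 0
```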
After PR #19883 is merged, vector_integrate will be able to work directly on ImplicitRegion objects.
In vector calculus, many regions cannot be defined using a single implicit equation. Instead, an inequality or a combination of the implicit equation and conditional equations is required. For example, a disc can only be represented using an inequality, ImplicitRegion(x**2 + y**2 < 9), and a semicircle can be represented as ImplicitRegion((x**2 + y**2 - 4) & (y < 0)). Support for such implicit regions needs to be added.
The parametrization obtained using the rational_parametrization method is in most cases a large expression. SymPy’s Integral is unable to work with such expressions: it takes too long and often does not return a result. We need to either simplify the parametric equations obtained or fix Integral to handle them.
Adding support to plot objects of ParametricRegion and ImplicitRegion.
This summer has been a great learning experience. Writing functionality, then testing and debugging it, has given me good experience in test-driven development. I think I realized many of the goals that I described in my proposal, though there were some major differences between the plan and the actual implementation. Personally, I had hoped to do better: I could not work much on the project during the last phase due to the start of the academic year.
I would like to thank Francesco for his valuable suggestions and for being readily available for discussions. He was always very friendly and helpful, and I would love to work more with him. S.Y. Lee also provided useful suggestions.
Special thanks to Johann Mitteramskogler, Professor Sonia Perez Diaz, and Professor Sonia L. Rueda, who personally helped me find the algorithms for determining the rational parametrization of conics.
SymPy is an amazing project with a great community. I will stick around after the GSoC period and continue contributing to SymPy, hopefully exploring other modules as well.


This report summarises the work I did during GSoC 2020 for SymPy. The links to the PRs are in chronological order. To follow the progress made during GSoC, see the weekly posts on my blog.
I am Sachin Agarwal, a third-year undergraduate student majoring in Computer Science and Engineering at the Indian Institute of Information Technology, Guwahati.
My project involved fixing and amending functions related to series expansions and limit evaluation. For further information, refer to my proposal for the project.
This section describes the actual work done during the coding period in terms of merged PRs.
#19292 : This PR added a condition to the limitinf method of gruntz.py, resolving incorrect limit evaluations.
#19297 : This PR replaced xreplace() with subs() in the rewrite method of gruntz.py, resolving incorrect limit evaluations.
#19369 : This PR fixed the _eval_nseries method of mul.py.
#19432 : This PR added functionality to the doit method of limits.py which uses is_meromorphic for limit evaluations.
#19461 : This PR corrected the _eval_as_leading_term method of the tan and sec functions.
#19508 : This PR fixed the _eval_nseries method of power.py.
#19515 : This PR added the _eval_rewrite_as_factorial and _eval_rewrite_as_gamma methods to the class subfactorial.
#19555 : This PR added a cdir parameter to handle series expansions on branch cuts.
#19646 : This PR rectified the mrv method of gruntz.py and the cancel method of polytools.py, resolving RecursionError and Timeout in limit evaluations.
#19680 : This PR added an is_Pow heuristic to limits.py to improve the limit evaluations of Pow objects.
#19697 : This PR rectified the _eval_rewrite_as_tractable method of the class erf.
#19716 : This PR added _singularities to the LambertW function.
#18696 : This PR fixed errors in assumptions when rewriting RisingFactorial / FallingFactorial as gamma or factorial.
#19741 : This PR reduced symbolic multiples of pi in trigonometric functions.
#19916 : This PR added an _eval_nseries method to sin and cos.
#19963 : This PR added an _eval_is_meromorphic method to the class BesselBase.
#18656 : This PR added Raabe's Test to the concrete module.
#19990 : This PR added _eval_is_meromorphic and _eval_aseries methods to the class lowergamma, _eval_is_meromorphic and _eval_rewrite_as_tractable methods to the class uppergamma, and rectified the eval method of the class besselk.
#20002 : This PR fixed the _eval_nseries method of log.
This section contains some of my PRs related to miscellaneous issues.
#19447 : This PR added some required test cases to test_limits.py.
#19537 : This PR fixed a minor performance issue.
#19604 : This PR fixed an AttributeError in limit evaluation.
This section contains some of the PRs which were reviewed by me.
This section contains some of the issues which were opened by me.
Poly(E**100000000) is slow to create.
gammasimp can be improved for integer variables.
This section describes the bugs fixed and the new features added during GSoC.
>>> from sympy import *
>>> from sympy.abc import x, y
>>> n = Symbol('n', positive=True, integer=True)
>>> limit(factorial(n + 1)**(1/(n + 1))  factorial(n)**(1/n), n, oo)
exp(1) # Previously produced 0
>>> n = Symbol('n', positive=True, integer=True)
>>> limit(factorial(n)/sqrt(n)*(exp(1)/n)**n, n, oo)
sqrt(2)*sqrt(pi) # Previously produced 0
>>> n = Symbol('n', positive=True, integer=True)
>>> limit(n/(factorial(n)**(1/n)), n, oo)
exp(1) # Previously produced oo
>>> limit(log(exp(3*x) + x)/log(exp(x) + x**100), x, oo)
3 # Previously produced 9
>>> limit((2*exp(3*x)/(exp(2*x) + 1))**(1/x), x, oo)
exp(1) # Previously produced exp(7/3)
>>> limit(sin(x)**15, x, 0, '-')
0 # Previously it hanged
>>> limit(1/x, x, 0, dir="+")
zoo # Previously raised ValueError
>>> limit(gamma(x)/(gamma(x - 1)*gamma(x + 2)), x, 0)
-1 # Previously it was returned unevaluated
>>> e = (x/2) * (-2*x**3 - 2*(x**3 - 1) * x**2 * digamma(x**3 + 1) + 2*(x**3 - 1) * x**2 * digamma(x**3 + x + 1) + x + 3)
>>> limit(e, x, oo)
1/3 # Previously produced 5/6
>>> a, b, c, x = symbols('a b c x', positive=True)
>>> limit((a + 1)*x - sqrt((a + 1)**2*x**2 + b*x + c), x, oo)
-b/(2*a + 2) # Previously produced nan
>>> limit_seq(subfactorial(n)/factorial(n), n)
1/e # Previously produced 0
>>> limit(x**(2**x*3**(-x)), x, oo)
1 # Previously raised AttributeError
>>> limit(n**(Rational(1, 1e9) - 1), n, oo)
0 # Previously it hanged
>>> limit((1/(log(x)**log(x)))**(1/x), x, oo)
1 # Previously raised RecursionError
>>> e = (2**x*(2 + 2**(-x)*(-2*2**x + x + 2))/(x + 1))**(x + 1)
>>> limit(e, x, oo)
exp(1) # Previously raised RecursionError
>>> e = (log(x, 2)**7 + 10*x*factorial(x) + 5**x) / (factorial(x + 1) + 3*factorial(x) + 10**x)
>>> limit(e, x, oo)
10 # Previously raised RecursionError
>>> limit((x**2000 - (x + 1)**2000) / x**1999, x, oo)
-2000 # Previously it hanged
>>> limit(((x**(x + 1) + (x + 1)**x) / x**(x + 1))**x, x, oo)
exp(exp(1)) # Previously raised RecursionError
>>> limit(Abs(log(x)/x**3), x, oo)
0 # Previously it was returned unevaluated
>>> limit(x*(Abs(log(x)/x**3)/Abs(log(x + 1)/(x + 1)**3) - 1), x, oo)
3 # Previously raised RecursionError
>>> limit((1 - S(1)/2*x)**(3*x), x, oo)
zoo # Previously produced 0
>>> d, t = symbols('d t', positive=True)
>>> limit(erf(1 - t/d), t, oo)
-1 # Previously produced 1
>>> s, x = symbols('s x', real=True)
>>> limit(erf(s*x)/erf(s), s, 0)
x # Previously produced 1
>>> limit(erfc(log(1/x)), x, oo)
2 # Previously produced 0
>>> limit(erf(sqrt(x) - x), x, oo)
-1 # Previously produced 1
>>> a, b = symbols('a b', positive=True)
>>> limit(LambertW(a), a, b)
LambertW(b) # Previously produced b
>>> limit(uppergamma(n, 1) / gamma(n), n, oo)
1 # Previously produced 0
>>> limit(besselk(0, x), x, oo)
0 # Previously produced besselk(0, oo)
>>> e = (exp(x) - 1)/x
>>> e.nseries(x, 0, 3)
1 + x/2 + x**2/6 + O(x**3) # Previously produced 1 + x/2 + O(x**2, x)
>>> e = (2/x + 3/x**2)/(1/x + 1/x**2)
>>> e.nseries(x, n=3)
3 - x + x**2 + O(x**3) # Previously produced 3 + O(x)
>>> e = (x**2 + x + 1) / (x**3 + x**2)
>>> series(e, x, oo)
x**(-5) - 1/x**4 + x**(-3) + 1/x + O(x**(-6), (x, oo))
# Previously produced x**(-5) + x**(-3) + 1/x + O(x**(-6), (x, oo))
>>> e = (1 - 1/(x/2 - 1/(2*x))**4)**(S(1)/8)
>>> e.series(x, 0, n=17)
1 - 2*x**4 - 8*x**6 - 34*x**8 - 152*x**10 - 714*x**12 - 3472*x**14 - 17318*x**16 + O(x**17)
# Previously produced 1 - 2*x**4 - 8*x**6 - 34*x**8 - 24*x**10 + 118*x**12 - 672*x**14 - 686*x**16 + O(x**17)
>>> asin(I*x + I*x**3 + 2)._eval_nseries(x, 3, None, 1)
-asin(2) + pi - sqrt(3)*x/3 + sqrt(3)*I*x**2/9 + O(x**3)
>>> limit(log(I*x - 1), x, 0, '-')
-I*pi
>>> n = symbols('n', integer=True)
>>> combsimp(RisingFactorial(-10, n))
3628800*(-1)**n/factorial(10 - n) # Previously produced 0
>>> ff(5, y).rewrite(gamma)
120/gamma(6 - y) # Previously produced 0
>>> Sum(factorial(n)/factorial(n + 2), (n, 1, oo)).is_convergent()
True # Previously raised NotImplementedError
>>> Sum((-n + (n**3 + 1)**(S(1)/3))/log(n), (n, 1, oo)).is_convergent()
True # Previously raised NotImplementedError
>>> f = log(x/(1 - x))
>>> f.series(x, 0.491, n=1).removeO()
-0.0360038887560022 # Previously raised ValueError
Improving the series, nseries, lseries and aseries functions.
Adding the _eval_is_meromorphic method or _singularities to as many special functions as possible.
This summer has been a great learning experience and has helped me get good exposure to test-driven development. I plan to continue contributing to SymPy and will also try to help the new contributors. I am grateful to my mentors, Kalevi Suominen and Sartaj Singh, for reviewing my work, giving me valuable suggestions, and being readily available for discussions.


Key highlights of this week’s work are:
Worked on the _eval_nseries method of log to fix some minor issues.
This marks the end of Phase 3 of the program.
Finally, this brings us to the end of Google Summer of Code, and I am really thankful to the SymPy community and my mentor Kalevi Suominen for always helping and supporting me.


GSoC 2020 Report Smit Lunagariya: Improving and Extending stats module


Key highlights of this week’s work are:
Raabe's Test helps to determine the convergence of a series. It has been added to the concrete module to handle those cases where the ratio test becomes inconclusive.
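Raabe's test in action on the series reported fixed in the merged-PRs section above: the ratio of successive terms tends to 1, so the ratio test alone is inconclusive.

```python
from sympy import Sum, factorial, oo, symbols

n = symbols('n', integer=True, positive=True)
# a(n) = n!/(n + 2)! = 1/((n + 1)*(n + 2)); the ratio a(n+1)/a(n) -> 1
# (ratio test inconclusive), but Raabe's limit is 2 > 1, so the series
# converges. Previously this raised NotImplementedError.
print(Sum(factorial(n)/factorial(n + 2), (n, 1, oo)).is_convergent())  # True
```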
Fixed limit evaluations related to lowergamma, uppergamma and the besselk function
We added the _eval_is_meromorphic method to the classes lowergamma and uppergamma so that some of the limits involving the lowergamma and uppergamma functions get evaluated using the meromorphic check already present in the limit codebase. Then, to make the lowergamma and uppergamma functions tractable for limit evaluations, an _eval_aseries method was added to lowergamma and an _eval_rewrite_as_tractable method to uppergamma.
Finally, we also rectified the eval method of the class besselk, so that besselk(nu, oo) automatically evaluates to 0.
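The corresponding examples from the results section can serve as a quick check of the fixed behaviour (assuming a SymPy version that includes #19990):

```python
from sympy import besselk, gamma, limit, oo, symbols, uppergamma

n, x = symbols('n x')
# besselk(nu, oo) now evaluates to 0 automatically ...
assert limit(besselk(0, x), x, oo) == 0               # previously unevaluated
# ... and the tractable rewrite of uppergamma handles this ratio.
assert limit(uppergamma(n, 1)/gamma(n), n, oo) == 1   # previously 0
print("ok")
```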


This is the final blog of the official program highlighting the final week. Some of the key discussions were:


Key highlights of this week’s work are:
Added _eval_is_meromorphic() to Bessel functions
We added the _eval_is_meromorphic method to the class BesselBase so that some of the limits involving Bessel functions get evaluated using the meromorphic check already present in the limit codebase.
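With _eval_is_meromorphic on BesselBase, a limit at an ordinary point reduces to the function value; a minimal check (my example, using the fact that besselj(0, 0) is 1):

```python
from sympy import besselj, limit, symbols

x = symbols('x')
# besselj(0, x) is analytic at x = 0, so the meromorphic fast path in
# the limit code returns the function value directly.
assert limit(besselj(0, x), x, 0) == 1
print("ok")
```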


This blog describes the 11th week of the program. Some of the key highlights of this week are:


Key highlights of this week’s work are:
Fixed periodicity of trigonometric functions
The _peeloff_pi method of sin had to be rectified to reduce symbolic multiples of pi in trigonometric functions. As a result of this fix, sin(2*n*pi + 4) now automatically evaluates to sin(4) when n is an integer.
Fixed limit evaluations related to trigonometric functions
We added an _eval_nseries method to both sin and cos to make limit evaluations more robust when it comes to trigonometric functions. We also added a piece of code to the cos.eval method so that the limit of cos(m*x), where m is nonzero and x tends to oo, evaluates to AccumBounds(-1, 1).
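Both fixes can be checked directly (behaviour as described above; the concrete cos(3*x) instance is mine):

```python
from sympy import AccumBounds, cos, limit, oo, pi, sin, symbols

n = symbols('n', integer=True)
x = symbols('x')
# _peeloff_pi now strips the symbolic multiple of pi ...
assert sin(2*n*pi + 4) == sin(4)
# ... and an oscillating limit yields a range instead of failing.
assert limit(cos(3*x), x, oo) == AccumBounds(-1, 1)
print("ok")
```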


This blog describes the 10th week of the program. Some of the highlights of this week are:


This blog describes week 9, the beginning week of the final phase. This week, I continued to work on the extension of Compound Distributions as well as completing the Matrix Distributions. Some of the highlights of this week are:


This blog provides a brief description of the last week of the second phase, i.e. week 8. Some of the key highlights of this week are:
This week, I spent most of my time reading about algorithms to parametrize algebraic curves and surfaces.
In many cases, it is easier to define a region using an implicit equation than using its parametric form. For example, a sphere can be defined with the implicit equation x^{2} + y^{2} + z^{2} = 4. Its parametric equations, although easy to determine, are tedious to write.
To integrate over implicitly defined regions, it is necessary to determine its parametric representation. I found the report on conversion methods between parametric and implicit curves from Christopher M. Hoffmann very useful. It lists several algorithms for curves and surfaces of different nature.
Every plane parametric curve can be expressed as an implicit curve. Some, but not all, implicit curves can be expressed as parametric curves. A similar statement holds for algebraic surfaces: every plane parametric surface can be expressed as an implicit surface, while some, but not all, implicit surfaces can be expressed as parametric surfaces.
One of the algorithms to parametrize conics is given below:
The difficult part is to find a point on the conic. For the first implementation, I just iterated over a few points and selected a point that satisfied the equation. I still need to implement an algorithm that can find a point on the curve.
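The classical trick for conics: fix one point on the curve and intersect it with the pencil of lines through that point; each slope t meets the conic in exactly one other point, which yields a rational parametrization. A sketch using a circle of radius 2 and the known point (-2, 0) (this concrete example is mine, not from the post):

```python
from sympy import simplify, solve, symbols

x, y, t = symbols('x y t')
f = x**2 + y**2 - 4          # a conic: circle of radius 2
px, py = -2, 0               # a known point on the conic
# Substitute the line y = py + t*(x - px) into the conic and solve for
# x; one root is px itself, the other is a rational function of t.
roots = solve(f.subs(y, py + t*(x - px)), x)
x_t = [r for r in roots if simplify(r - px) != 0][0]
y_t = simplify(py + t*(x_t - px))
# The parametrization satisfies the conic identically.
assert simplify(x_t**2 + y_t**2 - 4) == 0
print(x_t, y_t)
```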
The parametric equations determined by the algorithm are large and complex expressions. In most cases, the vector_integrate function takes too much time to calculate the integral. I tried fixing this using simplification functions in SymPy like trigsimp and expand_trig, but the problem still persists.
>>> ellipse = ImplicitRegion(x**2/4 + y**2/16 - 1, x, y)
>>> parametric_region_list(ellipse)
[ParametricRegion((8*tan(theta)/(tan(theta)**2 + 4), 4 + 8*tan(theta)**2/(tan(theta)**2 + 4)), (theta, 0, pi))]
>>> vector_integrate(2, ellipse)
### It gets stuck
The algorithm does not work for curves of higher algebraic degree like cubics.
A monoid is an algebraic curve of degree n that has a point of multiplicity n - 1. All conics and singular cubics are monoids. Parametrizing a monoid is easy if the special point is known.
To parametrize a monoid, we need to determine a singular point of multiplicity n - 1. The singular points are those points on the curve where both partial derivatives vanish. A singular point (x0, y0) of a curve C is said to be of multiplicity n if all the partial derivatives of f up to order n - 1 vanish there. This paper describes an algorithm to calculate such a point, but the algorithm is not trivial.
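For singular cubics, the special point can often be found by brute force with solve, since it must annihilate the curve and both partial derivatives; a small sketch on a nodal cubic (my example, not from the paper mentioned above):

```python
from sympy import diff, solve, symbols

x, y = symbols('x y')
f = y**2 - x**3 - x**2       # nodal cubic, singular at the origin
# A singular point satisfies f = df/dx = df/dy = 0.
sols = solve([f, diff(f, x), diff(f, y)], [x, y], dict=True)
print(sols)   # [{x: 0, y: 0}]
```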
My goal is to complete the work on parametrizing conics. If I can find a simple algorithm to determine the singular point of the required multiplicity, I will start working on it.
My goal for this week was to add support of integration over objects of geometry module.
In my GSoC proposal, I mentioned implementing classes to represent commonly used regions. This would allow a user to easily define regions without bothering about their parametric representation. SymPy’s geometry module has classes to represent commonly used geometric entities like lines, circles, and parabolas. Francesco told me it would be better to add support to use these classes instead. The integral takes place over the boundary of the geometric entity, not over the area or volume enclosed.
My first approach was to add a function parametric_region to the classes of the geometry module. Francesco suggested not modifying the geometry module, as this would make it dependent on the vector module. We decided instead to implement a function that returns a ParametricRegion for objects of the geometry module.
I learned about the singledispatch decorator of Python. It is used to create overloaded functions.
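Roughly, singledispatch picks an implementation based on the type of the first argument, which is exactly what mapping each geometry class to its own parametrization logic needs. A toy sketch (the classes here are stand-ins of my own, not SymPy's):

```python
from functools import singledispatch


class SegmentStub:
    pass


class CircleStub:
    pass


@singledispatch
def region_kind(entity):
    # Fallback for unregistered types.
    raise NotImplementedError(type(entity).__name__)


@region_kind.register(SegmentStub)
def _(entity):
    return "line region"


@region_kind.register(CircleStub)
def _(entity):
    return "curved region"


print(region_kind(CircleStub()))  # curved region
```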
Polygons cannot be represented as a single ParametricRegion object: each side of a polygon has its own separate parametric representation. To resolve this, we decided to return a list of ParametricRegion objects.
My next step was to modify vector_integrate to support objects of the geometry module. This was easy and involved calling the parametric_region_list function and integrating each ParametricRegion object.
The PR for the work has been merged. Note that we still do not have any way to create regions like disc and cone without their parametric representation. To support such regions, I think we need to add new classes to the geometry module.
I plan to work on implementing a class to represent implicitly defined regions.
The first phase of GSoC is over. We can now integrate a scalar or vector field over a parametrically defined region.
The PR for the ParametricIntegral class has been merged. In my last post, I mentioned a weird error. It turns out that I was importing pi as a symbol instead of as a number. Because of this, expressions were returned unevaluated, causing tests to fail.
I had a detailed discussion with Francesco about the route ahead. My plan has drifted from my proposal because we have decided not to implement new classes for special regions like circles and spheres. The geometry module already has classes to represent some of these regions. We have decided to use these classes.
Francesco reminded me that we have to complete the documentation of these changes. I was planning to finish the documentation in the last phase, but I will try to complete it sooner, as I may forget some details.
We also discussed adding support to perform integral transformations using Stokes' and Green's theorems. In many cases, such transformations can be useful, but adding them may be outside the scope of this project.
I have convinced Francesco to add classes representing implicitly defined regions. It might be the most difficult part of the project, but hopefully it will be useful to SymPy users.
My goal for next week is to make a PR adding support to integrate over objects of the geometry module.


Key highlights of this week’s work are:
Fixed incorrect limit evaluation related to the LambertW function
We added the _singularities feature to the LambertW function so that its limits get evaluated using the meromorphic check already present in the limit codebase.
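The corresponding fixed evaluation from the results section, runnable as a check (assuming a SymPy with #19716):

```python
from sympy import LambertW, limit, symbols

a, b = symbols('a b', positive=True)
# With _singularities declared, the meromorphic check applies and the
# limit no longer collapses to the bare limit point b.
assert limit(LambertW(a), a, b) == LambertW(b)
print("ok")
```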
Fixed errors in assumptions when rewriting RisingFactorial / FallingFactorial as gamma or factorial
The gamma and factorial rewrite methods of RisingFactorial and FallingFactorial did not handle all the possible cases, which caused errors in some evaluations. Thus, we decided to come up with a proper rewrite using Piecewise, which returns the correct rewrite depending on the assumptions on the variables. Handling such rewrites using Piecewise is never easy, and so there were a lot of failing test cases. After spending a lot of time debugging and fixing each failing test case, we were finally able to merge this.
This marks the end of Phase 2 of the program. I learnt a lot during these two months and gained many important things from my mentors.
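A quick numeric spot-check of the rewrite discussed above (my example): the gamma form must agree with the plain falling factorial at a concrete point.

```python
from sympy import ff, gamma, symbols

y = symbols('y')
# ff(5, 2) = 5*4 = 20; the gamma rewrite must agree after substitution.
rewritten = ff(5, y).rewrite(gamma)
assert rewritten.subs(y, 2) == 20
print("ok")
```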


This blog describes the 7th week of the program and the 3rd week of Phase 2. Some of the key highlights on the discussions and the implementations during this week are:


Key highlights of this week’s work are:
Improved the limit evaluations of Power objects
This PR improves the limit evaluations of Power objects. We first check if the limit expression is a Power object and then evaluate the limit accordingly, depending on the different cases. First of all, we express b**e in the form exp(e*log(b)). After this, we check if e*log(b) is meromorphic and evaluate the final result accordingly. This check helps us handle the trivial cases at the very beginning. If e*log(b) is not meromorphic, we separately evaluate the limits of the base and the exponent. This helps us determine the indeterminate form of the limit expression, if one is present. As we know, there are 3 indeterminate forms corresponding to power objects: 0**0, oo**0, and 1**oo, which need to be handled carefully. If there is no indeterminate form present, then no further evaluation is required; otherwise, we handle all three cases separately and correctly evaluate the final result.
We also added some code to improve the evaluation of limits containing Abs() expressions. For every Abs() term present in the limit expression, we replace it by its argument or the negative of its argument, depending on whether the limit of the argument is greater than or less than zero for the given limit variable. Finally, we were able to merge this after resolving some failing test cases.
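Two of the indeterminate power forms mentioned above, exercised directly (these standard instances are my own examples):

```python
from sympy import E, limit, oo, symbols

x = symbols('x', positive=True)
# 1**oo form: rewriting b**e as exp(e*log(b)) gives the classic e.
assert limit((1 + 1/x)**x, x, oo) == E
# oo**0 form: the exponent log(x)/x tends to 0.
assert limit(x**(1/x), x, oo) == 1
print("ok")
```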
Fixed limit evaluations involving error functions
The incorrect limit evaluations of error functions were mainly because the tractable rewrite was wrong and did not handle all the possible cases. For a proper rewrite, the limit variable needed to be passed to the corresponding rewrite method. This is because, to define a correct rewrite, we have to evaluate the limit of the argument of the error function for the passed limit variable. Thus, we added a default argument limitvar to all the tractable rewrite methods and resolved this issue. While debugging, we also noticed that the _eval_as_leading_term method of the error function was wrong, so it was fixed as well. Finally, we were able to merge this after resolving some failing test cases.
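The erf example from the results section doubles as a regression check here (assuming a SymPy with #19697):

```python
from sympy import erf, limit, symbols

s, x = symbols('s x', real=True)
# The limitvar-aware tractable rewrite keeps the x-dependence;
# previously this returned 1.
assert limit(erf(s*x)/erf(s), s, 0) == x
print("ok")
```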


This blog describes the 6th week of the official program and the 2nd week of Phase 2. By the end of this week, the Compound Distributions framework is ready as targeted, and I will now focus on Joint Distributions in the upcoming weeks of this phase.


Key highlights of this week’s work are:
Fixed RecursionError and Timeout in limit evaluations
The recursion errors in limit evaluations were mainly due to the fact that the indeterminate form 1**oo was not handled accurately in the mrv() function of the Gruntz algorithm, so some minor changes were required to fix those.
The major issue was handling the cases which were timing out. On digging deeper, we identified that the cancel() function of polytools.py was the reason. Thus, we decided to completely transform the cancel() function to speed up its algorithm. After this major modification, many test cases were failing, as the cancel() function plays an important role in simplifying evaluations and is thus used in many places across the codebase. Therefore, a lot of time was spent debugging and rectifying these test cases. Finally, we were able to merge this, enhancing the limit evaluation capabilities of SymPy.
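One of the evaluations from the results section that used to raise RecursionError now goes through (assuming a SymPy with #19646):

```python
from sympy import limit, log, oo, symbols

x = symbols('x', positive=True)
# The exponent (1/x)*log(1/log(x)**log(x)) tends to 0, so the limit is 1.
assert limit((1/(log(x)**log(x)))**(1/x), x, oo) == 1
print("ok")
```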


This blog describes week 5, the beginning week of Phase 2. Phase 2 will be mostly focused on Compound Distributions, which had been stalled since 2018, and on additions to Joint Distributions.
I spent this week working on the implementation of the ParametricIntegral
class.
When I was writing the test cases, I realized that the API of ParametricRegion could be improved: instead of passing limits as a dictionary, tuples can be used.
So I modified the API of the ParametricRegion class. The new API is closer to the API of the Integral class and more intuitive. I made a separate PR for this change to make it easy for reviewers.
Example of the new API:
ParametricRegion( ( x+y, x*y ), (x, 0, 2), (y, 0, 2))
As discussed in previous posts, we decided not to associate a coordinate system with the parametric region. Instead, we assume that the parametric region is defined in the coordinate system of the field whose integral is being calculated. We calculate the position vector and normal vector of the parametric region using the base scalars and vectors of the field. This works fine in most cases. But when the field does not contain any base scalar or vector in its expression, we cannot extract the coordinate system from the field.
ParametricIntegral(150, circle)
# We cannot determine the coordinate system from the field.
# To calculate the line integral, we need to find the derivative of the position vector.
# This is not possible until we know the base vector of the field
To handle this situation, I assign a coordinate system C to the region. This does not affect the result in any way as the result is independent of it. It just allows the existing algorithm to work in this case.
Francesco suggested making separate classes based on the nature of the field: vector and scalar. I am open to this idea, but I think it will be easier and more intuitive for users if they can use the same class to calculate the integral. I do not think the two cases are different enough from a user’s perspective to warrant separate interfaces.
Maybe we can have a function vector_integrate which returns an object of ParametricVectorIntegral or ParametricScalarIntegral depending on the nature of the field. This could work for other types of integrals too. Suppose we implement a class ImplicitIntegral to calculate the integral over an implicit region; the vector_integrate function could then return an ImplicitIntegral object after identifying that the region is defined implicitly. I think this would be great, and I will discuss this aspect further with Francesco.
When evaluating a double integral, the result sometimes depends on the order in which the integral is evaluated. If the bounds of one parameter u depend on another parameter v, we should integrate first with respect to u and then v.
For example, consider the problem of evaluating the area of a triangle.
T = ParametricRegion((x, y), (x, 0, 2), (y, 0, 10 - 5*x))
The area of the triangle is 10 units and should be independent of the order in which parameters are specified at the time of object initialization. But the double integral depends on the order of integration.
>>> integrate(1, (x, 0, 2), (y, 0, 10 - 5*x))
20 - 10*x
>>> integrate(1, (y, 0, 10  5*x), (x, 0, 2))
10
So parameters must be passed to integrate in the correct order. To overcome this issue, we topologically sort the parameters. SymPy already had a function to perform a topological sort in its utilities module. I implemented a function that generates the graph and passes it to the topological_sort function. This made my work easy.
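The utility in question is sympy.utilities.iterables.topological_sort; a minimal sketch of how the parameter graph from the triangle example can be ordered (an edge (y, x) here means y must be integrated before x, since y's bounds depend on x):

```python
from sympy import symbols
from sympy.utilities.iterables import topological_sort

x, y = symbols('x y')
# y's bounds depend on x, so y must come first in the integration order.
verts = [x, y]
edges = [(y, x)]
print(topological_sort((verts, edges)))  # [y, x]
```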
Some integrals are taking too long to compute. When the base scalars in the field are replaced by their parametric equivalents, the expression of the field becomes large. Further, the integrand is the dot product of the field with a vector, or the product of the field and the magnitude of a vector. The integrate function takes about 20-30 seconds to calculate the integral. I think this behavior is due to the expression of the integrand growing structurally despite being simple.
For example,
>>> solidsphere = ParametricRegion((r*sin(phi)*cos(theta), r*sin(phi)*sin(theta), r*cos(phi)),\
(r, 0, 2), (theta, 0, 2*pi), (phi, 0, pi))
>>> ParametricIntegral(C.x**2 + C.y**2, solidsphere)
In this case, the field, when its base scalars are replaced with parameters, becomes r**2*sin(phi)**2*sin(theta)**2 + r**2*sin(phi)**2*cos(theta)**2, although it can easily be simplified to r**2*sin(phi)**2.
SymPy has a function called simplify, which attempts various methods to simplify an expression. When the integrand is simplified with it before being passed to integrate, the result is returned almost immediately. Francesco rightly pointed out that simplify is computationally expensive and that we could try some more specific simplifications. I will look into it.
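For this particular integrand, the cheaper trigsimp is already enough, which is the kind of targeted simplification Francesco suggested:

```python
from sympy import cos, sin, symbols, trigsimp

r, theta, phi = symbols('r theta phi')
# The Pythagorean identity collapses the parametrized field.
integrand = r**2*sin(phi)**2*sin(theta)**2 + r**2*sin(phi)**2*cos(theta)**2
print(trigsimp(integrand))  # r**2*sin(phi)**2
```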
Some test cases are failing because the integrate function returns results in different forms. The results are mathematically equivalent but differ in structure. I find this strange; I do not think it has to do with hashing. I still have not figured out this problem.
Hopefully, I will complete the work on ParametricIntegral and get it merged. We can then start discussing how to represent implicit regions.
Part of this week was spent getting the PR for ParametricRegion merged. I made some fixes to get the failing tests to pass and also added the docstring.
I started working on the ParametricIntegral class. The ParametricIntegral class returns the integral of a scalar or vector field over a region defined parametrically. It should be able to calculate a line, surface, or volume integral depending on the nature of the parametric region. To identify the nature of the region, a dimensions property needs to be added to the ParametricRegion class.
I started by writing the test cases, to avoid my previous mistake of working on the implementation before deciding the API. The API of the ParametricIntegral class:
ParametricIntegral(field, parametricregion)
Some examples:
>>> curve = ParametricRegion((3*t  2, t + 1), {t: (1, 2)})
>>> ParametricIntegral(C.x, curve)
5*sqrt(10)/2
>>> semisphere = ParametricRegion((theta, phi), (2*sin(phi)*cos(theta), 2*sin(phi)*sin(theta), 2*cos(phi)),\
{theta: (0, 2*pi), phi: (0, pi/2)})
>>> ParametricIntegral(C.z, semisphere)
8*pi
I initially thought to make it a subclass of Integral, as a ParametricIntegral is a kind of integral. But there was no reuse of any code from Integral in ParametricIntegral, so I decided against it.
A rough sketch of the algorithm to calculate the integral is described below:
Use the integrate function to calculate the integral.
If the integrate function can evaluate the integral, return the result; otherwise return an unevaluated object of the ParametricIntegral class.
We also had a discussion on whether swapping the upper and lower limits has any effect on the result of a vector integral. For multiple integrals, the sign of the result changes when the upper and lower bounds of a parameter are interchanged. I do not think we came to any conclusion, but this won't matter for integrals over parametric regions, as the bounds are directly defined by the user.
I will now start working on the implementation of the ParametricIntegral class. I am also trying to get more involved in SymPy outside of my GSoC project, and I hope I can become a contributing member of the community.


Key highlights of this week’s work are:
Adds cdir parameter to handle series expansions on branch cuts
Finally, after spending almost 2 weeks on this, we were able to merge the PR, adding the very important functionality of series expansions on branch cuts to the codebase. Previously, SymPy either raised an error or computed the series expansion incorrectly when the input value lay on a branch cut. Now, for most of the functions, the expansion produced is correct.
Not only this, we added the cdir parameter to the leadterm and as_leading_term functions as well. We even extended the functionality a bit to the limits module, so limits of values lying on the branch cuts of a function are now also computed correctly in most cases. We are planning to extend this functionality to all the remaining special functions and wherever else possible, to make the codebase even more robust.
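Two directional evaluations in the spirit of this change, runnable as checks (the log instance is my own example; the sin one appears in the results section above):

```python
from sympy import I, limit, log, pi, sin, symbols

x = symbols('x')
# Approaching -1 from above log's branch cut (the negative real axis).
assert limit(log(-1 + I*x), x, 0, '+') == I*pi
# A directional limit that previously hung.
assert limit(sin(x)**15, x, 0, '-') == 0
print("ok")
```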


This blog completes the awesome month of statistical learning and coding, and the official Phase 1 of Google Summer of Code 2020.


Key highlights of this week’s work are:
Fixed incorrect evaluation caused by subfactorial in a limit_seq expression
A rewrite converting any subfactorial term present in an expression into an equivalent term expressed in the form of factorial or gamma was added, which helped resolve the issue.
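The fixed evaluation reported in the results section, as a runnable check (assuming a SymPy with #19515):

```python
from sympy import E, factorial, limit_seq, subfactorial, symbols

n = symbols('n', integer=True, positive=True)
# The derangement ratio subfactorial(n)/factorial(n) tends to 1/e;
# previously this produced 0.
result = limit_seq(subfactorial(n)/factorial(n), n)
assert result.equals(1/E)
print("ok")
```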
Adds cdir parameter to handle series expansions on branch cuts

Currently, many functions in the codebase are unable to produce correct series expansions on branch cuts. As a result, limit evaluation for these functions is incorrect when the limiting value lies on a branch cut. We have therefore decided to introduce a parameter named cdir, which stands for "complex direction": it indicates the direction from which the series expansion is required, helping us produce the correct expansion. Special care needs to be taken when handling series expansions at the branch points. Once we are finished with this work, it will be a major enhancement to the limit evaluation and series expansion capabilities of SymPy.
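A minimal sketch of how the parameter is intended to be used, assuming cdir is accepted as a keyword by series and as_leading_term as in the merged work (log's branch cut lies along the negative real axis):

```python
from sympy import Symbol, log, I, pi

x = Symbol('x')
expr = log(-1 + I*x)

# cdir=1: x approaches 0 along the positive real axis, so the argument
# -1 + I*x approaches -1 from above the branch cut, giving the
# principal value I*pi as the leading term.
lt = expr.as_leading_term(x, cdir=1)

# The series expansion likewise picks the branch consistent with cdir.
s = expr.series(x, 0, 3, cdir=1)
print(lt, s)
```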
This marks the end of Phase 1 of the program. I learnt a lot during this month and gained many important insights from my mentors.
The first week of GSoC has been completed. My goal for this week was to add the ParametricRegion class. Many common regions have well-known parametric representations, and calculating integrals over parametric regions is a common problem in vector calculus.
The PR for adding the ParametricRegion class is #19472.
An object of the ParametricRegion class represents a region in space defined in terms of parameters. If a user wants to perform integration over a region, they create a ParametricRegion object, which can then be passed to other classes to perform the integration.
The implementation part was easy; the difficulty was in deciding the API. I spent much of the week discussing the API, but it was worth it. The ParametricRegion class will most likely be used by many users, so it is important for it to be simple and intuitive. In hindsight, I should have first completed writing the test cases and examples.
The final API of ParametricRegion class:
ParametricRegion(parameters_or_coordsys, definition, limits=None)
Some Examples:
p = ParametricRegion(t, (t, t**3), limits={t: (1, 2)})
halfdisc = ParametricRegion((r, theta), (r*cos(theta), r*sin(theta)), {r: (-2, 2), theta: (0, pi)})
cylinder = ParametricRegion((theta, z), (cos(theta), sin(theta), z), {theta: (0, 2*pi), z: (0, 4)})
The first argument, parameters_or_coordsys, specifies the parameters of the parametric region. When there is more than one parameter, the user has to pass them as a tuple. When a CoordSys3D object is passed, its base scalars are used as the parameters.
The definition tuple maps the base scalars to their parametric representation. It is not necessary for every base scalar to have a parametric definition.
A parametric representation simply defines the base scalars in terms of some parameters. If the base scalars are not known, the region cannot be completely specified. My initial approach was to include an argument to specify the coordinate system.
However, passing a CoordSys3D object can be unintuitive for users. In most cases, the base scalars of the parametric region are the same as those of the vector/scalar field, so we can avoid passing a CoordSys3D object by assuming that the base scalars of the parametric region and the field are the same.
p = ParametricRegion((r, theta), (r*cos(theta), r*sin(theta)), limits={r: (0,1), theta: (0, pi)})
Integral(C.x**2*C.y, p)
# C.x = r*cos(theta), C.y = r*sin(theta), C.z = C.z
Integral(R.x*R.y*R.z, p)
# R.x = r*cos(theta), R.y = r*sin(theta), R.z = R.z
There can be a situation where SymPy cannot determine the coordinate system of the field. In such situations, we can raise a ValueError. For example:
Integral(R.x*C.z, p)
Most parametric regions have bounds on their parameters. Note that it is not necessary for every parameter to have both an upper and a lower bound.
There can be many ways for a user to define the bounds of a parametric region. We decided to use a dictionary, which we named limits. The keys of the dictionary are the parameters, and the values are tuples of the form (lower bound, upper bound). Another approach would be a nested tuple of bounds, where every subtuple represents the bounds of one parameter.
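Since the PR was still under review at this point, here is a minimal, hypothetical sketch of the API being discussed. ParametricRegionSketch and its bounds helper are illustrative names only, not SymPy's actual implementation:

```python
from sympy import symbols, cos, sin, pi

class ParametricRegionSketch:
    """Illustrative stand-in for the proposed ParametricRegion API."""

    def __init__(self, parameters, definition, limits=None):
        # A single parameter may be passed bare; normalize it to a tuple.
        if not isinstance(parameters, tuple):
            parameters = (parameters,)
        self.parameters = parameters
        self.definition = definition
        # limits maps parameter -> (lower bound, upper bound)
        self.limits = limits if limits is not None else {}

    def bounds(self, parameter):
        # Parameters without explicit bounds are treated as unbounded.
        return self.limits.get(parameter, (None, None))

r, theta = symbols('r theta')
halfdisc = ParametricRegionSketch((r, theta),
                                  (r*cos(theta), r*sin(theta)),
                                  {r: (-2, 2), theta: (0, pi)})
print(halfdisc.bounds(r))      # (-2, 2)
print(halfdisc.bounds(theta))  # (0, pi)
```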
My next week's task is to get started with the ParametricIntegral class. This class will represent the integral of a scalar/vector field over a parametric surface. I will first write the unit tests and examples.
Planet SymPy is made from the blogs of SymPy's contributors. The opinions it contains are those of the contributor. This site is powered by Rawdog and Rawdog RSS. Feed readers can read Planet SymPy with RSS, FOAF or OPML.