-
-
Notifications
You must be signed in to change notification settings - Fork 66
Fixing parsing for FormBox #545
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 3 commits
64c9295
3d83960
f15fe6c
73926ff
f3920f0
f6f3d1f
9a2680e
9485d9d
0cfd7d0
ca7dd9b
78fabca
65cfbac
8fcf8de
72bd2b3
89d74ee
f77fae3
dd57ebe
a67a6af
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -308,13 +308,35 @@ def p_RawLeftAssociation(self, token): | |
| def p_LeftRowBox(self, token): | ||
| self.consume() | ||
| children = [] | ||
| self.box_depth += 1 | ||
| self.bracket_depth += 1 | ||
| # If this does not happend, it would be because | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. happend -> happen I am not sure though that I understand. Is the following the same?: If this construct appears inside a So this implies there could be things other than a FormBox that can occur too. OtherscriptBox? Span? What are the merits of using an
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I try to explain the problem in the comments, but it seems it is not easy. Here we go again: the problem is that Then, to build the expression, we need to know all the tokens before
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As far as I could see, the unique case with this special behavior was
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
| # it was called when a `FormBox` was found. | ||
| if token.tag == "LeftRowBox": | ||
| self.box_depth += 1 | ||
| self.bracket_depth += 1 | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Every place there is some depth that increases by one, there should be another place that decreases the depth by one. Where is that?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the tat is not
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You are saying something more or less like what the code says. Why is the decrement not reached. What is the depth used for?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The deep was used to handle nested expressions like When a I tried to find a way to put the increment and the decrement together, but it just was more involved than this approach: I should have allowed the increment of the deep variables, and then doing the decrement before returning the FormBox. I didn't think that it was clearer.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I hope the new comment helps to understand the logic.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the description. Something about this seems a little to complicated and that there should be something that follows the specification in a little more direct way:
I would like to think about and reflect on this some more. I think we can make this cleaner, clearer and simpler. What would be ideal is if we could do a cleanup step first - no changes to behavior yet. And after this, then come up with a way to implement this.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Actually, in WMA,
I am sure it is possible, just that after trying for a while, I didn't find a better way.
What would be a cleanup step? The list of (simple) tests to check are the tests that I added in test/core/parser/test_parser.py.
OK, in that case, I am going to put this as a draft for a while. |
||
|
|
||
| token = self.next() | ||
| while token.tag not in ("RightRowBox", "OtherscriptBox"): | ||
| if token.tag in "FormBox": | ||
|
TiagoCavalcante marked this conversation as resolved.
Outdated
|
||
| break | ||
| newnode = self.parse_box(0) | ||
| children.append(newnode) | ||
| token = self.next() | ||
|
|
||
| if token.tag == "FormBox": | ||
| if len(children) == 0: | ||
| fmt = Symbol("StandardForm") | ||
| elif len(children) == 1: | ||
| fmt = children[0] | ||
| if type(fmt) is String: | ||
| fmt_name = fmt.value | ||
| if is_symbol_name(fmt_name): | ||
| fmt = Symbol(fmt_name) | ||
| else: | ||
| fmt = Node("Removed", Symbol("$$Failure")) | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rocky,you are right: it should be to be compatible with WMA. For example: In[1]:= (3 ` + b) Out[1]= FormBox[RowBox[{+, b}], Removed[$$Failure]] In[2]:= %//FullForm Out[2]//FullForm= FormBox[RowBox[List["+", "b"]], Removed["$$Failure"]]
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in f6f3d1f |
||
| else: | ||
| fmt = Node("RowBox", Node("List", *children)) | ||
| rest = self.p_LeftRowBox(token) | ||
| return Node("FormBox", rest, fmt) | ||
| if len(children) == 0: | ||
| result = String("") | ||
| elif len(children) == 1: | ||
|
|
@@ -840,20 +862,6 @@ def b_FractionBox(self, box1, token, p): | |
| box2 = self.parse_box(q + 1) | ||
| return Node("FractionBox", box1, box2) | ||
|
|
||
| def b_FormBox(self, box1, token, p): | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why was this removed?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because with the current logic, to implement the behavior of
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this is what I mean about the myopic kind of thing. FormBox needs to worry about precedence and the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I was aware about the possible issue with the precedence. So I checked it against WMA, and it also seems to neglect the operator precedence of What makes me think that this is the right approach (or at least a better approach than the current one) is that it does not break the existing tests, and it just makes a difference if a |
||
| q = misc_ops["FormBox"] | ||
| if q < p: | ||
| return None | ||
| if box1 is None: | ||
| box1 = Symbol("StandardForm") # RawForm | ||
| elif is_symbol_name(box1.value): | ||
| box1 = Symbol(box1.value, context=None) | ||
| else: | ||
| box1 = Node("Removed", String("$$Failure")) | ||
| self.consume() | ||
| box2 = self.parse_box(q) | ||
| return Node("FormBox", box2, box1) | ||
|
|
||
| def b_OverscriptBox(self, box1, token, p): | ||
| q = misc_ops["OverscriptBox"] | ||
| if q < p: | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -694,10 +694,14 @@ def testFraction(self): | |
| self.check("\\( \\/ \\)", 'FractionBox["", ""]') | ||
|
|
||
| def testFormBox(self): | ||
| self.check("\\( 1 \\` b \\)", 'FormBox["b", Removed["$$Failure"]]') | ||
| self.check("\\( \\` b \\)", 'FormBox["b", StandardForm]') | ||
| self.check("\\( a \\` b \\)", 'FormBox["b", a]') | ||
| self.check("\\( a \\` \\)", 'FormBox["", a]') | ||
| self.check("\\( a \\` b + c \\)", 'FormBox[RowBox[{"b", "+", "c"}], a]') | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. A small thing but I would appreciate it if you' use raw strings here as well. Thanks.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just followed the convention used in this file. But I agree, it would be clearer with raw strings.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done in f6f3d1f |
||
| self.check("\\( a \\` b \\` c \\)", 'FormBox[FormBox["c", b], a]') | ||
| self.check('\\( "a" \\` b \\)', 'FormBox["b", Removed[$$Failure]]') | ||
| self.check("\\( 3.2 \\` b \\)", 'FormBox["b", Removed[$$Failure]]') | ||
| self.check("\\( 3.2 + a \\` b \\)", 'FormBox["b", RowBox[{"3.2", "+", "a"}]]') | ||
|
|
||
| def testRow(self): | ||
| self.check("\\( \\)", String("")) | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been looking at this and notice a couple of small formatting changes. If this docstring were marked as a raw string, e.g.
r""" ... """we would need all of the ugly and confusing double backslashes\\everywhere..On line 333:
has a hard line break which the current homegrown formatter doesn't ignore. And it would be nice to add an extra space between and end
</dt>and the next starting<dd>.These aren't strictly about this PR, but I happen to notice since you happen to be in the area.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Also, I was thinking about to patching the homegrown formatter to handle these line breaks, and then avoiding that flake8 complains about long lines. But one step at the time...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in f6f3d1f