Headers and Footers API #236
Whoops. Looks like I had some Python 3 issues. Fixed now. |
Hi @eupharis, looks like you've made some good progress.

It probably makes sense for you to sketch out the API you're thinking of so we can consider it directly. You've correctly identified that as the key part of getting a commit. It's the one thing we can't change once it's been put in so needs to be gotten right the first time :)

There are analysis pages under the docs directory, here's one that's related:

There are a few others around this one that are fairly good examples:

I use the word "protocol" in there to denote describing the API by showing a typical usage of it, rather than just saying what methods there are on what objects. I think that's a pretty good way to capture the API you propose. It would take too long and be too error-prone to pull that out of your draft code; better to spell it out so we can discuss it easily.

I'm thinking one of these pages is a good place to gather that sort of thing because it will then become part of the documentation. It will also probably be the first item committed.

There are other examples, some well developed, in the analysis documentation for python-pptx; here's a recent one:

I recommend just starting out with the bits that seem helpful for communicating your thinking. I see these as documentation for my own use when I'm writing them. Things I need to get straight in my head or details I need to dig out of the schema or Microsoft API or whatever that I know I'll need when I get into the code.

Then we can focus the discussion around a fairly well circumscribed area. |
Ok cool! I just pushed a protocol here: https://github.com/eupharis/python-docx/blob/master/docs/dev/analysis/features/header-footer.rst It looks slightly better via local Sphinx but the github rendering is pretty good. It includes a write-up of my notes of how the hell docx headers work. I will definitely reference it in the future. I modeled the protocol format after this one: http://python-docx.readthedocs.org/en/latest/dev/analysis/features/text/font-color.html |
Ping! |
Hi @eupharis, apologies, I'm not actively working on this project at the moment (other projects call as you will understand I'm sure), so it's taken a while to get to this.

Thanks for writing up the analysis page. This helps a lot in getting details ironed out before you go in to do the actual implementation. I've been surprised a bit to discover that the implementation generally turns out to be the easy part; it's the discovery, analysis, and resolution up front that take the time and all the noodling :)

The approach you suggest in the analysis page appears to be inconsistent with the existing approach for that sort of thing and also how Word handles the broader bit about headers and footers.

I just realized I had started some analysis work on headers/footers a while back. I just posted it here, you might want to have a look:

I think the first thing is that headers and footers have a natural affinity for sections, rather than the document as a whole. Most documents probably only have a single section, but of course the API needs to work for the general case. So the first step in the protocol would be something like this I'm pretty sure:

>>> document = Document('xyx.docx')
>>> section = document.sections[-1]  # last section, for example
>>> odd_page_header = section.primary_header
>>> odd_page_header
<docx.header.Header object at 0x1029a6820>

Another item that jumps out is that there can be as many as three headers and/or footers for each section, one for odd pages (the primary header/footer), one for even pages (if laying out verso/recto for binding), and a possibly different first-page header/footer. So the API will need to both provide access to those and provide properties like

I think you'll want to take a look at the MS API for headers and footers to see what they do. We usually find it pretty useful to model after that API, although there are important departures in certain cases to make things more Pythonic or just more rational. In any case it's useful to understand the MS API protocol as a key input to the analysis. I know in this case they maintain a collection accessed by indices 1, 2, or 3. I'm thinking we just make three properties (six actually when you include footers, e.g.

You'll also want to pull together the key parts of the XML schema along with examples/snippets. You can find plenty of examples of those in the other analysis pages. You might be able to reuse some of the bits from sections here, since they're so closely related:

Also have a look at this issue for some ideas and prior work:

Let us know how you go :) |
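To make the "three properties per section" idea concrete, here is a minimal, self-contained sketch. It is not python-docx code: `Header`, `Section`, and `HeaderFooterIndex` are hypothetical stand-ins, and only `primary_header` is a name taken from the comment above; `first_page_header` and `even_page_header` are assumed names for illustration. The enum values follow my understanding of how Word's own collection maps the indices 1, 2, and 3.

```python
from enum import IntEnum


class HeaderFooterIndex(IntEnum):
    """Indices 1, 2, 3 as used by Word's header/footer collection (assumed mapping)."""

    PRIMARY = 1
    FIRST_PAGE = 2
    EVEN_PAGES = 3


class Header:
    """Hypothetical stand-in for a header object; it only stores text."""

    def __init__(self, text=""):
        self.text = text


class Section:
    """Hypothetical section exposing one named property per header kind."""

    def __init__(self):
        # One independent Header per index, instead of an indexed collection.
        self._headers = {index: Header() for index in HeaderFooterIndex}

    @property
    def primary_header(self):
        return self._headers[HeaderFooterIndex.PRIMARY]

    @property
    def first_page_header(self):  # assumed name, not from the thread
        return self._headers[HeaderFooterIndex.FIRST_PAGE]

    @property
    def even_page_header(self):  # assumed name, not from the thread
        return self._headers[HeaderFooterIndex.EVEN_PAGES]


section = Section()
section.primary_header.text = "Odd pages"
section.even_page_header.text = "Even pages"
```

Footers would add three more properties of the same shape, giving the six mentioned above. Named properties trade the compactness of an indexed collection for call sites that read without a lookup table.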
Ok awesome! I totally understand about other projects hah. This is all super helpful. I'm currently using this code on a project of mine, and I've run into the problem with this v1 version of headers not playing nicely with sections. The approach you outline above should work much better. I've blocked out some time next week and will dig into this further! |
Next week turned out to be rather optimistic ;) I merged in all your I spelled out the XML that should exist for all the different all/even/odd/first header scenarios: I also updated the analysis I did so that Your I think we are getting close with the API?!? |
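For readers without the analysis page at hand, the sketch below shows roughly what the header references in a section's properties look like and how they could be read with the standard library. The `w:headerReference` element and its `default`, `even`, and `first` types come from the WordprocessingML schema; the `rId3` to `rId5` values and the `header_rids` helper are made up for illustration and are not part of python-docx.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"

# Sample section properties with all three header kinds (relationship ids are invented).
SECT_PR = f"""
<w:sectPr xmlns:w="{W}" xmlns:r="{R}">
  <w:headerReference w:type="default" r:id="rId3"/>
  <w:headerReference w:type="even" r:id="rId4"/>
  <w:headerReference w:type="first" r:id="rId5"/>
  <w:titlePg/>
</w:sectPr>
"""


def header_rids(sect_pr_xml):
    """Map each header type ('default', 'even', 'first') to its relationship id."""
    sect_pr = ET.fromstring(sect_pr_xml.strip())
    return {
        ref.get(f"{{{W}}}type"): ref.get(f"{{{R}}}id")
        for ref in sect_pr.findall(f"{{{W}}}headerReference")
    }


rids = header_rids(SECT_PR)
```

Each relationship id points at a separate header part in the package, which is why a section with no `w:headerReference` of a given type has no header of that kind of its own.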
I totally agree that defining all the API and understanding the XML structure and writing the analysis is the hard part. This is actually pretty simple to implement now. I could port all my |
It's very counterintuitive that way, isn't it? It still surprises me when I do it :) I'll take a look at your analysis page and provide some feedback. |
Ok @eupharis, I've taken a look and have some remarks I want to add in as comments. What I'll need is to have this file in a commit of its own so I can make inline comments. And it needs to be the whole file, all at once. If there are changes spread across multiple commits it breaks up the text and it's impossible to trace through.

What I'm thinking probably makes sense is for you to create a new, separate pull request that we use for moving this to commit/merge. We can keep this one open as it might come in handy. The new one will be built up commit by commit, essentially getting each one ready to merge before bringing in the next one.

How's your Git? You'll be needing it here I expect. Let me know if you need help on some part as we go.

Here's what we'll need. The new branch will be called feature/header. It will be based (rooted) at python-openxml/master. All commits submitted for review and later merging will appear on that branch, which will be eupharis/feature/header once you push it to your repo. On your local it will likely be origin/feature/header, but that can vary depending on where you cloned from.

This first commit should be named: 'docs: document header/footer analysis'. It will remain a single commit until merging, at least on your GitHub repo. You can keep it any way you like locally as you're the only one who sees that and occasionally it makes sense to have multiple commits and then squash them or whatever before pushing.

In order to overwrite the pull request branch (eupharis/feature/header) with changes to an existing commit, you'll need to push with the --force option. After a forced push, the pull request will automatically show the updated version. The comments from me (often extensive) from the prior version of that commit will be gone I believe, so make sure you get what you need before force pushing :)

In general there will be no merges in the pull request branch; the commit history will remain entirely linear.
Likewise, there will be no "fixed this, fixed that" commits. Changes will be rebased into place as necessary. If changes need to be made to a commit that's already been merged, we'll talk and work it out. Take a browse through the commit history for the latent styles features added starting here: e7930e4 You'll see a pattern that happens over and over again, and should happen here too:
You have to go back about that far to see it, because I did a bit of reorganization before the last release and that has a very different form (and it's not done very often :). You'll notice a few things:
Let me know if you need more to go on. Otherwise I'll review and remark on your analysis page once it's up :) |
Ok new PR created! I'm curious why we are basing it off of |
Basing it off of feature/header is even better, by all means, let's do that :) |
Perfect! Here we go: I spent a fair while this morning playing with |
Closing this one for now since we have a new, later one open for header/footer. |
This PR is way not ready to actually be merged. Just want to get some feedback on the API before I go farther.
tests/expanded_docx/. I skipped them for now.)

Here's a sample:

Fuller API notes below in docs/header_footer.rst.