Restart numbering of an ordered list in document. #25

proofit404 · 2014-03-18T08:47:51Z

We can easily add an ordered list item with document.add_paragraph(style='ListNumber'). But how can we restart the list's numbering?

lafolle · 2014-08-06T10:53:28Z

+1

ilnurgi · 2014-08-15T05:49:51Z

+1 :)

downtown12 · 2015-07-06T02:58:56Z

Hi @scanny. I have a problem with the workaround you described on stackoverflow.com. To restart the list numbering, you said one needs to locate the numbering definition of the style as well as the abstract numbering definition it points to. I tried but failed to locate these two definitions. Can you explain in detail how they can be located? I haven't found any documentation about them.

scanny · 2015-07-06T20:50:48Z

Can you link to the SO response? That would save me the time of searching for it.

downtown12 · 2015-07-07T01:41:44Z

Here is the link, the same one you posted in response to issue #87: http://stackoverflow.com/questions/23446268/python-docx-how-to-restart-list-lettering/23464442#23464442

scanny · 2015-07-07T03:43:13Z

@downtown12 I don't have time to develop step-by-step instructions for you, but if you're looking for the right direction to start looking for yourself, these might be helpful:

The document object provides access to the numbering part for the document (numbering.xml). This call should get you there: document.part.numbering_part
The NumberingPart object provides access to a sequence of NumberingDefinition objects: numbering_part.numbering_definitions
The NumberingDefinitions class looks like where API support ends at the moment. You can get the w:numbering element from it though, which is the parent of all the numbering definitions: numbering_definitions._numbering. From there you'll need to work with lxml calls to get its children using XPath and so on.
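
To make the two kinds of definition concrete, here is a minimal sketch that maps each concrete numbering definition (w:num) to the abstract definition (w:abstractNum) it points to. It uses only the standard library on a made-up, heavily trimmed numbering.xml string; SAMPLE_NUMBERING, qn and num_to_abstract are hypothetical names for illustration. With python-docx, the w:numbering element would instead come from numbering_definitions._numbering as described above and be queried with lxml.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, trimmed numbering.xml content (not from a real template).
SAMPLE_NUMBERING = f"""
<w:numbering xmlns:w="{W}">
  <w:abstractNum w:abstractNumId="10">
    <w:lvl w:ilvl="0"><w:start w:val="1"/><w:numFmt w:val="decimal"/></w:lvl>
  </w:abstractNum>
  <w:num w:numId="3"><w:abstractNumId w:val="10"/></w:num>
  <w:num w:numId="4"><w:abstractNumId w:val="10"/></w:num>
</w:numbering>
"""


def qn(tag):
    """Expand a prefixed name such as 'w:val' into '{namespace}val'."""
    prefix, local = tag.split(":")
    return f"{{{NS[prefix]}}}{local}"


def num_to_abstract(numbering):
    """Map each concrete numId to the abstractNumId it references."""
    mapping = {}
    for num in numbering.findall("w:num", NS):
        ref = num.find("w:abstractNumId", NS)
        mapping[int(num.get(qn("w:numId")))] = int(ref.get(qn("w:val")))
    return mapping


numbering = ET.fromstring(SAMPLE_NUMBERING)
mapping = num_to_abstract(numbering)
print(mapping)
```

Paragraphs refer to a numId; the numId in turn selects the abstract definition that holds the per-level formatting.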

downtown12 · 2015-07-07T06:09:46Z

Thanks a lot @scanny. Your instructions helped me connect the attributes you mentioned above with the elements of numbering.xml. They are very helpful.

yurac · 2015-09-15T09:20:32Z

@downtown12, I also use this feature. All the infrastructure is already in python-docx, so enabling it is easy. Below is a patch. I also added accessors for numId, ilvl and ind.left for convenience.

diff --git a/docx/document.py b/docx/document.py
index 655a70e..4a0ec06 100644
--- a/docx/document.py
+++ b/docx/document.py
@@ -51,6 +51,12 @@ class Document(ElementProxy):
         paragraph.add_run().add_break(WD_BREAK.PAGE)
         return paragraph

+    def get_new_list(self, abstractNumId):
+        """
+        Returns a new numId that references given abstractNumId
+        """
+        return self.numbering.numbering_definitions.add_num(abstractNumId, True)
+
     def add_paragraph(self, text='', style=None):
         """
         Return a paragraph newly added to the end of the document, populated
@@ -157,6 +163,14 @@ class Document(ElementProxy):
         return self._part.styles

     @property
+    def numbering(self):
+        """
+        Provides access to the numbering part.
+        """
+        x=self._part.numbering_part
+        return self._part.numbering_part
+
+    @property
     def tables(self):
         """
         A list of |Table| instances corresponding to the tables in the
diff --git a/docx/oxml/numbering.py b/docx/oxml/numbering.py
index aeedfa9..097f3d6 100644
--- a/docx/oxml/numbering.py
+++ b/docx/oxml/numbering.py
@@ -96,13 +96,15 @@ class CT_Numbering(BaseOxmlElement):
     """
     num = ZeroOrMore('w:num', successors=('w:numIdMacAtCleanup',))

-    def add_num(self, abstractNum_id):
+    def add_num(self, abstractNum_id, restart=False):
         """
         Return a newly added CT_Num (<w:num>) element referencing the
         abstract numbering definition identified by *abstractNum_id*.
         """
         next_num_id = self._next_numId
         num = CT_Num.new(next_num_id, abstractNum_id)
+        if restart:
+            num.add_lvlOverride(ilvl=0).add_startOverride(1)
         return self._insert_num(num)

     def num_having_numId(self, numId):
diff --git a/docx/parts/numbering.py b/docx/parts/numbering.py
index e324c5a..186c6f8 100644
--- a/docx/parts/numbering.py
+++ b/docx/parts/numbering.py
@@ -43,5 +43,8 @@ class _NumberingDefinitions(object):
         super(_NumberingDefinitions, self).__init__()
         self._numbering = numbering_elm

+    def add_num(self, abstractNum_id, restart=False):
+        return self._numbering.add_num(abstractNum_id, restart).numId
+
     def __len__(self):
         return len(self._numbering.num_lst)
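
For readers who want to see what the restart amounts to in numbering.xml, the following is a rough standard-library sketch of the XML the patch aims to produce: a new w:num that references an existing abstract definition and carries a w:lvlOverride/w:startOverride for level 0. The sample XML and the helper names (qn, add_restarting_num) are assumptions made for illustration, and the sketch picks max(numId) + 1 rather than whatever python-docx's _next_numId does.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, trimmed numbering.xml content (not from a real template).
SAMPLE_NUMBERING = f"""
<w:numbering xmlns:w="{W}">
  <w:abstractNum w:abstractNumId="10">
    <w:lvl w:ilvl="0"><w:start w:val="1"/><w:numFmt w:val="decimal"/></w:lvl>
  </w:abstractNum>
  <w:num w:numId="1"><w:abstractNumId w:val="10"/></w:num>
  <w:num w:numId="2"><w:abstractNumId w:val="10"/></w:num>
</w:numbering>
"""


def qn(tag):
    """Expand a prefixed name such as 'w:val' into '{namespace}val'."""
    prefix, local = tag.split(":")
    return f"{{{NS[prefix]}}}{local}"


def add_restarting_num(numbering, abstract_id, ilvl=0, start=1):
    """Append a w:num for abstract_id whose level ilvl restarts at start."""
    existing = [int(n.get(qn("w:numId"))) for n in numbering.findall("w:num", NS)]
    num_id = max(existing, default=0) + 1
    # Appending at the end is fine for this sample; a real file may have a
    # trailing w:numIdMacAtCleanup element that must stay last.
    num = ET.SubElement(numbering, qn("w:num"), {qn("w:numId"): str(num_id)})
    ET.SubElement(num, qn("w:abstractNumId"), {qn("w:val"): str(abstract_id)})
    override = ET.SubElement(num, qn("w:lvlOverride"), {qn("w:ilvl"): str(ilvl)})
    ET.SubElement(override, qn("w:startOverride"), {qn("w:val"): str(start)})
    return num_id


numbering = ET.fromstring(SAMPLE_NUMBERING)
new_id = add_restarting_num(numbering, abstract_id=10)
new_num = numbering.findall("w:num", NS)[-1]
start = new_num.find("w:lvlOverride/w:startOverride", NS).get(qn("w:val"))
print(new_id, start)
```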

downtown12 · 2015-09-18T17:07:35Z

@yurac Thanks.
Actually, I solved my issue several days ago with a different workaround.
Rather than changing the python-docx package itself, I added the feature in my own app.
It took much more code than your patch, though.
I read the document_part as well as the numbering_part from the Document object that I was editing.
First I parse the document_part using lxml; each time I extract a numbered-list paragraph, I create an abstract numbering node for it in the numbering_part. That makes sure every list has its numbering restarted.
It seems all roads lead to Rome.
Thanks very much for showing me another way.

yurac · 2015-09-19T08:47:02Z

@downtown12, I also thought about adding a new abstract number node per list, but was too lazy to implement it :)

Thanks!

jmhansen · 2015-09-23T21:02:48Z

@yurac, do you plan to submit a pull request?

yurac · 2015-09-24T07:08:09Z

@nesnahnoj , I submitted one: #210

jmhansen · 2015-09-24T14:48:18Z

Thanks @yurac. Can you explain how you are accessing/using numId, ilvl and ind.left? I've been using the same command described in the original post of this issue: document.add_paragraph(style='ListNumber'), but I want to solve for the following use case:

Add the list of strings [a, b, c, d, e, f] to a docx file in the following outline format:

a
1. b
2. c
d
1. e
f

yurac · 2015-09-24T15:22:28Z

You have to set ilvl and numId for each item. You can set them using the accessors in the pull request; to do so you need to understand the class relationships in python-docx. Generate a numId with the function in the pull request, passing the abstractNumId as the parameter. Get the abstractNumId from your template file. For your example you do not need to set ind.left. I will try to put together an example when I have time.
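
As an illustration of what "set ilvl and numId for each item" means at the XML level, here is a small standard-library sketch that builds w:p elements carrying a w:numPr for the requested outline. The helper names (qn, make_list_paragraph) and the numId value 5 are made up for the example; it does not use the accessors from the pull request.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def qn(tag):
    """Expand a prefixed name such as 'w:val' into '{namespace}val'."""
    prefix, local = tag.split(":")
    return f"{{{NS[prefix]}}}{local}"


def make_list_paragraph(text, num_id, ilvl):
    """Build a w:p whose w:numPr names the list (numId) and level (ilvl)."""
    p = ET.Element(qn("w:p"))
    p_pr = ET.SubElement(p, qn("w:pPr"))
    num_pr = ET.SubElement(p_pr, qn("w:numPr"))
    # w:ilvl comes before w:numId inside w:numPr.
    ET.SubElement(num_pr, qn("w:ilvl"), {qn("w:val"): str(ilvl)})
    ET.SubElement(num_pr, qn("w:numId"), {qn("w:val"): str(num_id)})
    run = ET.SubElement(p, qn("w:r"))
    ET.SubElement(run, qn("w:t")).text = text
    return p


# (text, level) pairs for the outline a / b, c / d / e / f.
outline = [("a", 0), ("b", 1), ("c", 1), ("d", 0), ("e", 1), ("f", 0)]
paragraphs = [make_list_paragraph(text, 5, level) for text, level in outline]
levels = [
    int(p.find("w:pPr/w:numPr/w:ilvl", NS).get(qn("w:val"))) for p in paragraphs
]
print(levels)
```

All six paragraphs share one numId, so they belong to the same list; only the level differs between outer and nested items.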

yurac · 2015-09-24T15:42:00Z

@nesnahnoj

jeberger · 2015-09-25T13:52:34Z

This appears to work reasonably well for numbered lists, but not so well for multi-level bullet lists. From my tests, it looks like you need to have a w:abstractNum and a w:num in numbering.xml. Moreover, the w:abstractNum must have a w:lvl for each ilvl you use.

Those are used to define the list style (bullet character and indentation, mostly). When they are missing, it looks like Word picks reasonable defaults for a numbered list but not for a bullet list.

@yurac, PR #110 generates random values for numId and checks that they are not already used in the document text. However it does not check whether they are defined in numbering.xml. I suspect that this might lead to funny behaviour if you pick a number that has a definition in numbering.xml even though it is unused in the text. Especially if that definition is for the wrong list type…
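
One way to avoid the collision described here is to treat a numId as taken if it is either defined in numbering.xml or referenced from the document text. The sketch below shows that check with the standard library on made-up XML; the sample strings and the function name next_unused_num_id are hypothetical, and it is not what either pull request does.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical samples: numbering.xml defines numIds 1 and 2, while the
# document text references 1 and a dangling 3.
SAMPLE_NUMBERING = f"""
<w:numbering xmlns:w="{W}">
  <w:abstractNum w:abstractNumId="0"/>
  <w:num w:numId="1"><w:abstractNumId w:val="0"/></w:num>
  <w:num w:numId="2"><w:abstractNumId w:val="0"/></w:num>
</w:numbering>
"""
SAMPLE_DOCUMENT = f"""
<w:document xmlns:w="{W}"><w:body>
  <w:p><w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr></w:p>
  <w:p><w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="3"/></w:numPr></w:pPr></w:p>
</w:body></w:document>
"""


def qn(tag):
    """Expand a prefixed name such as 'w:val' into '{namespace}val'."""
    prefix, local = tag.split(":")
    return f"{{{NS[prefix]}}}{local}"


def next_unused_num_id(numbering, document):
    """Return the smallest numId >= 1 that is neither defined nor referenced."""
    defined = {int(n.get(qn("w:numId"))) for n in numbering.findall("w:num", NS)}
    referenced = {int(n.get(qn("w:val"))) for n in document.iter(qn("w:numId"))}
    taken = defined | referenced
    candidate = 1
    while candidate in taken:
        candidate += 1
    return candidate


numbering = ET.fromstring(SAMPLE_NUMBERING)
document = ET.fromstring(SAMPLE_DOCUMENT)
free_id = next_unused_num_id(numbering, document)
print(free_id)
```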

yurac · 2015-09-25T14:00:53Z

@nesnahnoj here is code that should generate the example you asked for:

#!/usr/bin/env python

from docx import Document

document = Document("template.docx")

# The requested outline as (text, level) pairs.
ITEMS = [("a", 0), ("b", 1), ("c", 1), ("d", 0), ("e", 1), ("f", 0)]


def add_list(document, abstract_num_id):
    # get_new_list(), num_id and level come from the patch and pull
    # request mentioned above.
    num_id = document.get_new_list(abstract_num_id)
    for text, level in ITEMS:
        p = document.add_paragraph(text=text, style="ListParagraph")
        p.num_id = num_id
        p.level = level


# Add desired numbering styles to your template file.
# Extract abstractNumId from there. In this example, abstractNumId is 10.
add_list(document, "10")

# Add the same list once again. Requesting a new numId restarts the
# numbering at the outer level.
add_list(document, "10")

document.save("num.docx")

yurac · 2015-09-25T14:03:41Z

@jeberger multi-level bullet lists work for me the same way. I create an abstractNumId for such a list using Word and then reference it the same way I do for numbered lists.

jeberger · 2015-09-25T14:38:24Z

@yurac and the result when opened in Word looks like:

1st item of 1st level list
1st item of 2nd level list
2nd item of 2nd level list
2nd item of 1st level list

instead of:

1st item of 1st level list
- 1st item of 2nd level list
- 2nd item of 2nd level list
2nd item of 1st level list

jeberger · 2015-09-25T14:40:51Z

Note that this depends on your docx template. My results are with the default.docx that comes with v.0.8.5

yurac · 2015-09-25T14:55:06Z

@jeberger I do not use default.docx. I created a template and added a numbered multi-level list and a bulleted multi-level list in the style I want to use. Then I opened the template's numbering.xml component and found the corresponding abstractNumIds for the numbered list and the bulleted list I had created. Now I always use these IDs together with the pull request I posted above, and I get the expected results for both numbered and bulleted multi-level lists.

jeberger · 2015-09-25T15:18:19Z

That's more or less the point I wanted to make: this requires a specific docx template and code that is tailored to use the same IDs as this template. It will not work with a generic template, and you may have to change the IDs in the code each time you change the template.

yurac · 2015-09-25T15:27:48Z

Yes. I think a better approach would be adding a new abstractNumId per list. However, there is currently no API for adding a new abstractNumId, so you can only use the ones defined in your template.
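
A rough sketch of the "one abstract definition per list" idea, using only the standard library on made-up XML: copy an existing w:abstractNum under a fresh abstractNumId, keep it ahead of the w:num elements, and add a w:num that points at the copy. The sample XML and the helper names (qn, clone_abstract_num) are hypothetical, and Word may need further adjustments to the copy that this sketch does not attempt.

```python
import copy
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, trimmed numbering.xml content (not from a real template).
SAMPLE_NUMBERING = f"""
<w:numbering xmlns:w="{W}">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0"><w:numFmt w:val="bullet"/></w:lvl>
  </w:abstractNum>
  <w:abstractNum w:abstractNumId="10">
    <w:lvl w:ilvl="0"><w:start w:val="1"/><w:numFmt w:val="decimal"/></w:lvl>
  </w:abstractNum>
  <w:num w:numId="1"><w:abstractNumId w:val="10"/></w:num>
</w:numbering>
"""


def qn(tag):
    """Expand a prefixed name such as 'w:val' into '{namespace}val'."""
    prefix, local = tag.split(":")
    return f"{{{NS[prefix]}}}{local}"


def clone_abstract_num(numbering, source_id):
    """Copy abstractNum source_id and add a w:num that references the copy."""
    abstracts = numbering.findall("w:abstractNum", NS)
    source = next(
        a for a in abstracts if a.get(qn("w:abstractNumId")) == str(source_id)
    )
    new_abstract_id = max(int(a.get(qn("w:abstractNumId"))) for a in abstracts) + 1
    clone = copy.deepcopy(source)
    clone.set(qn("w:abstractNumId"), str(new_abstract_id))
    # Insert right after the last w:abstractNum so that all abstract
    # definitions stay ahead of the w:num elements.
    numbering.insert(list(numbering).index(abstracts[-1]) + 1, clone)
    num_ids = [int(n.get(qn("w:numId"))) for n in numbering.findall("w:num", NS)]
    num_id = max(num_ids, default=0) + 1
    num = ET.SubElement(numbering, qn("w:num"), {qn("w:numId"): str(num_id)})
    ET.SubElement(num, qn("w:abstractNumId"), {qn("w:val"): str(new_abstract_id)})
    return new_abstract_id, num_id


numbering = ET.fromstring(SAMPLE_NUMBERING)
new_abstract_id, new_num_id = clone_abstract_num(numbering, 10)
order = [child.tag.split("}")[1] for child in numbering]
print(new_abstract_id, new_num_id, order)
```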

akobler · 2016-06-06T09:46:01Z

+1

ameily · 2017-04-18T23:31:09Z

I need this functionality for a project. It looks like there are two PRs open, #110 and #210, and both are awaiting tests. Can I step in and help and, if so, which PR should I go off of? I personally like the API in #110 better.

scanny · 2017-04-21T18:58:45Z

Hi Adam, you're more than welcome to contribute. The key thing that stops most folks is getting the tests done. Python folks just don't seem to be test-driven as a whole and this project is strictly test-driven. No commit gets merged without tests. You can look at the commit history to get an idea of the granularity and flow of how these feature additions go. Generally look for a commit that starts with 'docs: document analysis for ...' followed by one starting 'acpt: add scenarios for xxx'. Then the implementation follows and then the pattern repeats.

The first step is developing the enhancement proposal, also known as the "analysis page". This is where the API is resolved and so it's a natural first step. It's also separately committable, whether you go on to develop the feature or not. After that, you can pick and choose whatever you want from the existing pull requests to use in yours.

Neither of the two PRs looks like it went in the right direction, which is typically what happens when you don't start with the analysis document. There's a lot to think through and inputs to be collected and understood, like the relevant XML Schema excerpts, Word's own behaviors with respect to lists, what the spec has to say about it (little generally :), and what the MS API is for doing the same from VBA.

Let me know if you need more to go on :)

Sebastancho · 2017-08-01T17:38:25Z

Any updates on this issue? Or can we contribute?

cm-cm-cm-cm · 2018-06-05T09:19:51Z

Are there any updates to this?

chaithanyaramkumar · 2018-06-20T05:40:44Z

How can I read the bullets or numbering in an existing document?
For example, if the input is
1.apple
2.boy
the output should be
['1.apple','2.boy']
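
A very simplified sketch of one way to approach this, using only the standard library on a made-up document body: the number is not stored in the paragraph text, so it has to be recomputed by counting paragraphs per (numId, ilvl). The assumptions are large and deliberate: every list is decimal, starts at 1, uses a "N." label, and takes its numbering from a direct w:numPr. The sample XML and the function name read_numbered_text are hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical, trimmed document body with two items of the same list.
SAMPLE_BODY = f"""
<w:body xmlns:w="{W}">
  <w:p>
    <w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr>
    <w:r><w:t>apple</w:t></w:r>
  </w:p>
  <w:p>
    <w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr>
    <w:r><w:t>boy</w:t></w:r>
  </w:p>
</w:body>
"""


def qn(tag):
    """Expand a prefixed name such as 'w:val' into '{namespace}val'."""
    prefix, local = tag.split(":")
    return f"{{{NS[prefix]}}}{local}"


def read_numbered_text(body):
    """Return paragraph texts, prefixing list items with a recomputed number."""
    counters = {}
    result = []
    for p in body.findall("w:p", NS):
        text = "".join(t.text or "" for t in p.iter(qn("w:t")))
        num_id = p.find("w:pPr/w:numPr/w:numId", NS)
        if num_id is None:
            # Not a directly numbered paragraph: keep the text unchanged.
            result.append(text)
            continue
        ilvl = p.find("w:pPr/w:numPr/w:ilvl", NS)
        level = ilvl.get(qn("w:val")) if ilvl is not None else "0"
        key = (num_id.get(qn("w:val")), level)
        counters[key] = counters.get(key, 0) + 1
        result.append(f"{counters[key]}.{text}")
    return result


items = read_numbered_text(ET.fromstring(SAMPLE_BODY))
print(items)
```

A faithful reader would also need the number format, label text, start values and overrides from numbering.xml, none of which this sketch consults.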

madphysicist · 2018-06-27T18:37:25Z

So I've made a thing that searches the existing abstract numbering schemes for the style of the current paragraph, and sets a reasonable abstract style based on that if possible:

def list_number(doc, par, prev=None, level=None, num=True):
    """
    Makes a paragraph into a list item with a specific level and
    optional restart.

    An attempt will be made to retrieve an abstract numbering style that
    corresponds to the style of the paragraph. If that is not possible,
    the default numbering or bullet style will be used based on the
    ``num`` parameter.

    Parameters
    ----------
    doc : docx.document.Document
        The document to add the list into.
    par : docx.paragraph.Paragraph
        The paragraph to turn into a list item.
    prev : docx.paragraph.Paragraph or None
        The previous paragraph in the list. If specified, the numbering
        and styles will be taken as a continuation of this paragraph.
        If omitted, a new numbering scheme will be started.
    level : int or None
        The level of the paragraph within the outline. If ``prev`` is
        set, defaults to the same level as in ``prev``. Otherwise,
        defaults to zero.
    num : bool
        If ``prev`` is :py:obj:`None` and the style of the paragraph
        does not correspond to an existing numbering style, this will
        determine whether or not the list will be numbered or bulleted.
        The result is not guaranteed, but is fairly safe for most Word
        templates.
    """
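    # ``level`` may still be None here when a new list is started; fall
    # back to 0, the same default that is applied further down.
    xpath_options = {
        True: {'single': 'count(w:lvl)=1 and ', 'level': 0},
        False: {'single': '', 'level': 0 if level is None else level},
    }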

    def style_xpath(prefer_single=True):
        """
        The style ID comes from the outer-scope variable ``par``
        (``par.style.style_id``).
        """
        style = par.style.style_id
        return (
            'w:abstractNum['
                '{single}w:lvl[@w:ilvl="{level}"]/w:pStyle[@w:val="{style}"]'
            ']/@w:abstractNumId'
        ).format(style=style, **xpath_options[prefer_single])

    def type_xpath(prefer_single=True):
        """
        The type is from the outer-scope variable ``num``.
        """
        type = 'decimal' if num else 'bullet'
        return (
            'w:abstractNum['
                '{single}w:lvl[@w:ilvl="{level}"]/w:numFmt[@w:val="{type}"]'
            ']/@w:abstractNumId'
        ).format(type=type, **xpath_options[prefer_single])

    def get_abstract_id():
        """
        Select as follows:

            1. Match single-level by style (get min ID)
            2. Match exact style and level (get min ID)
            3. Match single-level decimal/bullet types (get min ID)
            4. Match decimal/bullet in requested level (get min ID)
            5. 0
        """
        for fn in (style_xpath, type_xpath):
            for prefer_single in (True, False):
                xpath = fn(prefer_single)
                ids = numbering.xpath(xpath)
                if ids:
                    return min(int(x) for x in ids)
        return 0

    if (prev is None or
            prev._p.pPr is None or
            prev._p.pPr.numPr is None or
            prev._p.pPr.numPr.numId is None):
        if level is None:
            level = 0
        numbering = doc.part.numbering_part.numbering_definitions._numbering
        # Compute the abstract ID first by style, then by num
        anum = get_abstract_id()
        # Set the concrete numbering based on the abstract numbering ID
        num = numbering.add_num(anum)
        # Make sure to override the abstract continuation property
        num.add_lvlOverride(ilvl=level).add_startOverride(1)
        # Extract the newly-allocated concrete numbering ID
        num = num.numId
    else:
        if level is None:
            level = prev._p.pPr.numPr.ilvl.val
        # Get the previous concrete numbering ID
        num = prev._p.pPr.numPr.numId.val
    par._p.get_or_add_pPr().get_or_add_numPr().get_or_add_numId().val = num
    par._p.get_or_add_pPr().get_or_add_numPr().get_or_add_ilvl().val = level

This is in no way comprehensive or particularly robust, but it works pretty well for what I am trying to do.

chaithanyaramkumar · 2018-07-03T04:22:55Z

thanks

chaithanyaramkumar · 2018-07-13T05:47:55Z

I have a document with one table.
We need to determine whether the first row is a column header (true, otherwise false),
and likewise whether the first column is a row header (true, otherwise false).
Please answer.

nitinkhosla79 · 2020-04-02T17:48:29Z

All kudos to @jlovegren0. He added low-level support for numbering styles.
We are able to generate multilevel bullet lists now.
PR: #582
We are helping him test his PR. We ask everyone to try his API and sample code (in the PR comments) and share your success story or provide feedback.
Thank you all for your support.

komawar · 2020-04-02T18:10:35Z

Yes, excellent work indeed by @jlovegren0

Nested numbering lists work with correct numbering.

komawar · 2020-04-06T02:32:06Z

We have a good discussion on PR: #582
It lays out different possible implementations of a solution and some of the corner cases and challenges involved.

Kudos again to @jlovegren0 for all the carefully crafted inputs and workarounds.

VictorBancho · 2023-07-24T07:01:11Z

> @yurac Thanks. Actually, I solved my issue several days ago with a different workaround. Rather than changing the python-docx package itself, I added the feature in my own app. It took much more code than your patch, though. I read the document_part as well as the numbering_part from the Document object that I was editing. First I parse the document_part using lxml; each time I extract a numbered-list paragraph, I create an abstract numbering node for it in the numbering_part. That makes sure every list has its numbering restarted. It seems all roads lead to Rome. Thanks very much for showing me another way.

Hi @downtown12, I know it has been quite a few years since you posted this, but I have run into this issue and wanted to try your solution. Would you mind elaborating on how you parsed the document_part (which property?) using lxml and located the numbered-list paragraphs? Furthermore, how did you create the abstract number nodes, and where and how did you insert them?

I've iterated through document.part._element and its children, but that didn't show anything insightful (no list number objects etc., even though the document has lists). It also wasn't clear to me whether the python-docx docs explain how to do this.
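
One thing that can make list paragraphs hard to spot this way: a paragraph can take its numbering from its paragraph style rather than from a direct w:numPr, in which case the paragraph XML only shows a w:pStyle reference. The sketch below resolves a numId either way, using the standard library on made-up document and styles XML. The sample strings and the name effective_num_id are hypothetical, style inheritance via w:basedOn is ignored, and whether this explains the case above depends on the document.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Hypothetical samples: one style carrying numbering, three paragraphs.
SAMPLE_STYLES = f"""
<w:styles xmlns:w="{W}">
  <w:style w:type="paragraph" w:styleId="ListNumber">
    <w:pPr><w:numPr><w:numId w:val="5"/></w:numPr></w:pPr>
  </w:style>
</w:styles>
"""
SAMPLE_BODY = f"""
<w:body xmlns:w="{W}">
  <w:p><w:pPr><w:pStyle w:val="ListNumber"/></w:pPr></w:p>
  <w:p><w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="2"/></w:numPr></w:pPr></w:p>
  <w:p/>
</w:body>
"""


def qn(tag):
    """Expand a prefixed name such as 'w:val' into '{namespace}val'."""
    prefix, local = tag.split(":")
    return f"{{{NS[prefix]}}}{local}"


def effective_num_id(p, styles):
    """Return the numId from the paragraph itself, else from its style."""
    direct = p.find("w:pPr/w:numPr/w:numId", NS)
    if direct is not None:
        return int(direct.get(qn("w:val")))
    style_ref = p.find("w:pPr/w:pStyle", NS)
    if style_ref is None:
        return None
    style_id = style_ref.get(qn("w:val"))
    for style in styles.findall("w:style", NS):
        if style.get(qn("w:styleId")) == style_id:
            via_style = style.find("w:pPr/w:numPr/w:numId", NS)
            if via_style is None:
                return None
            return int(via_style.get(qn("w:val")))
    return None


styles = ET.fromstring(SAMPLE_STYLES)
body = ET.fromstring(SAMPLE_BODY)
num_ids = [effective_num_id(p, styles) for p in body.findall("w:p", NS)]
print(num_ids)
```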

scanny modified the milestones: v0.6.0, 0.6.3 May 1, 2014

scanny modified the milestones: v0.8.0, 0.6.3 May 13, 2014

scanny added the numbering label Jun 17, 2014

scanny mentioned this issue Aug 15, 2014

trouble with listnumers #87

Closed

mohamedattahri mentioned this issue Nov 27, 2014

Added new add_list method to enhance lists support. #110

Closed

photuris mentioned this issue Jan 29, 2016

Add ability to restart numbering #210

Open

scanny removed this from the Sections milestone Apr 9, 2016

nikeqiang mentioned this issue Feb 7, 2018

Feature Request: Paragraph.get_listnum() #471

Open

jlovegren0 mentioned this issue Dec 1, 2018

Adding low-level support for numbering styles. #582

Open

lilyzhaochina mentioned this issue Jan 16, 2019

How to get automatic caption numbering? #600

Open

stdedos mentioned this issue Nov 2, 2020

Nested lists behavior is different in LibreOffice and in MS Word 2016 #893

Closed

renatodamas mentioned this issue Dec 12, 2024

Correctly reading enumeration and list numbers/letters #1454

Open
