3 \W@sdZddlmZddlmZddlmZddlZddlZddlmZddl m Z ej d Z d d Z Gd d d ZGdddeZGdddeZGdddeZGdddeZGdddeZGdddeZGdddeZGdddeZGdddeZGd d!d!eZdS)"a CORE MARKDOWN BLOCKPARSER =========================================================================== This parser handles basic parsing of Markdown blocks. It doesn't concern itself with inline elements such as **bold** or *italics*, but rather just catches blocks, lists, quotes, etc. The BlockParser is made up of a bunch of BlockProssors, each handling a different type of block. Extensions may add/replace/remove BlockProcessors as they need to alter how markdown blocks are parsed. )absolute_import)division)unicode_literalsN)util) BlockParserMARKDOWNcKst|}t||jd<t||jd<t||jd<t||jd<t||jd<t||jd<t||jd<t ||jd<t ||jd <t ||jd <|S) z2 Build the default block parser used by Markdown. emptyindentcodeZ hashheaderZ setextheaderhrZolistZulistquoteZ paragraph) rEmptyBlockProcessorblockprocessorsListIndentProcessorCodeBlockProcessorHashHeaderProcessorSetextHeaderProcessor HRProcessorOListProcessorUListProcessorBlockQuoteProcessorParagraphProcessor) md_instancekwargsparserr%build/lib/markdown/blockprocessors.pybuild_block_parsersrc@sBeZdZdZddZddZddZdd d Zd d Zd dZ dS)BlockProcessora Base class for block processors. Each subclass will provide the methods below to work with the source and tree. Each processor will need to define it's own ``test`` and ``run`` methods. The ``test`` method should return True or False, to indicate whether the current block should be processed by this processor. If the test passes, the parser will call the processors ``run`` method. cCs||_|jj|_dS)N)rmarkdown tab_length)selfrrrr__init__4szBlockProcessor.__init__cCst|r|dSdSdS)z, Return the last child of an etree element. rN)len)r"parentrrr lastChild8szBlockProcessor.lastChildcCsxg}|jd}xH|D]@}|jd|jr>|j||jdq|jsR|jdqPqWdj|dj|t|dfS)z= Remove a tab from the front of each line of the given text.   N)split startswithr!appendstripjoinr%)r"textZnewtextlineslinerrrdetab?s   zBlockProcessor.detabrcCs\|jd}xFtt|D]6}||jd|j|r|||j|d||<qWdj|S)z? Remove a tab from front of lines but allowing dedented lines. r(r)N)r+ranger%r,r!r/)r"r0levelr1irrr looseDetabLs  zBlockProcessor.looseDetabcCsdS)ay Test for block type. Must be overridden by subclasses. As the parser loops through processors, it will call the ``test`` method on each to determine if the given block of text is of that type. This method must return a boolean ``True`` or ``False``. The actual method of testing is left to the needs of that particular block type. It could be as simple as ``block.startswith(some_string)`` or a complex regular expression. As the block type may be different depending on the parent of the block (i.e. inside a list), the parent etree element is also provided and may be used as part of the test. Keywords: * ``parent``: A etree element which will be the parent of the block. * ``block``: A block of text from the source which has been split at blank lines. Nr)r"r&blockrrrtestTszBlockProcessor.testcCsdS)a Run processor. Must be overridden by subclasses. When the parser determines the appropriate type of a block, the parser will call the corresponding processor's ``run`` method. This method should parse the individual lines of the block and append them to the etree. Note that both the ``parent`` and ``etree`` keywords are pointers to instances of the objects which should be edited in place. Each processor must make changes to the existing objects as there is no mechanism to return new/different objects to replace them. This means that this method should be adding SubElements or adding text to the parent, and should remove (``pop``) or add (``insert``) items to the list of blocks. Keywords: * ``parent``: A etree element which is the parent of the current block. * ``blocks``: A list of all remaining blocks of the document. Nr)r"r&blocksrrrrunhszBlockProcessor.runN)r) __name__ __module__ __qualname____doc__r#r'r3r7r9r;rrrrr)s  rc@sFeZdZdZdgZddgZddZddZd d Zd d Z d dZ dS)rz Process children of list items. Example: * a list item process this part or this part liulolcGs&tj|f|tjd|j|_dS)Nz ^(([ ]{%s})+))rr#recompiler! INDENT_RE)r"argsrrrr#szListIndentProcessor.__init__cCsL|jd|joJ|jjjd oJ|j|jkpJt|oJ|doJ|dj|jkS)Nr)detabbedrr$r$) r,r!rstateisstatetag ITEM_TYPESr% LIST_TYPES)r"r&r8rrrr9s  zListIndentProcessor.testcCs&|jd}|j||\}}|j||}|jjjd|j|jkrt|rn|dj|j krn|jj |d|gn|jj ||gn|j|jkr|jj ||gnxt|o|dj|jkr |d j rt j jd}|d j |_ d|d _ |d jd||jj|d |n |j|||jjjdS)NrrGrpr*r$r$r$r$r$r$r$r$)pop get_levelr7rrHsetrJrKr%rL parseBlocksr0retreeElementinsert parseChunk create_itemreset)r"r&r:r8r5siblingrMrrrr;s&         zListIndentProcessor.runcCs"tjj|d}|jj||gdS)z< Create a new li and parse the block with it as the parent. r@N)rrR SubElementrrQ)r"r&r8r@rrrrVszListIndentProcessor.create_itemcCs|jj|}|r&t|jd|j}nd}|jjjdr>d}nd}xR||kr|j|}|dk r|j |j ksv|j |j kr|j |j kr|d7}|}qDPqDW||fS)z* Get level of indent based on list level. rrlistN) rEmatchr%groupr!rrHrIr'rJrLrK)r"r&r8m indent_levelr5ZchildrrrrOs     zListIndentProcessor.get_levelN) r<r=r>r?rKrLr#r9r;rVrOrrrrrs  $rc@s eZdZdZddZddZdS)rz Process code blocks. cCs|jd|jS)Nr))r,r!)r"r&r8rrrr9szCodeBlockProcessor.testcCs|j|}|jd}d}|dk rr|jdkrrt|rr|djdkrr|d}|j|\}}tjd|j|jf|_n>tj j |d}tj j |d}|j|\}}tjd|j|_|r|j d|dS)Nrr*prer z%s %s z%s ) r'rNrJr%r3r AtomicStringr0rstriprRrYrT)r"r&r:rXr8theRestr r_rrrr;s  zCodeBlockProcessor.runN)r<r=r>r?r9r;rrrrrsrc@s.eZdZejdZddZddZddZdS) rz(^|\n)[ ]{0,3}>[ ]?(.*)cCst|jj|S)N)boolREsearch)r"r&r8rrrr9szBlockQuoteProcessor.testcs|jd}jj|}|rd|d|j}jj||gdjfdd||jdjdD}j|}|dk r|j dkr|}nt j j |d}jj jdjj||jj jdS)Nrr(csg|]}j|qSr)clean).0r2)r"rr sz+BlockQuoteProcessor.run..Z blockquote)rNrdrestartrrQr/r+r'rJrrRrYrHrPrUrW)r"r&r:r8r]beforerXr r)r"rr;s   zBlockQuoteProcessor.runcCs2|jj|}|jdkrdS|r*|jdS|SdS)z( Remove ``>`` from beginning of a line. >r*N)rdr[r.r\)r"r2r]rrrrfs    zBlockQuoteProcessor.cleanN) r<r=r>rCrDrdr9r;rfrrrrrs rc@sVeZdZdZdZejdZejdZejdZ dZ ddgZ dd Z d d Z d d ZdS)rz Process ordered list blocks. rBz^[ ]{0,3}\d+\.[ ]+(.*)z ^[ ]{0,3}((\d+\.)|[*+-])[ ]+(.*)z^[ ]{4,7}((\d+\.)|[*+-])[ ]+.*1rAcCst|jj|S)N)rcrdr[)r"r&r8rrrr9:szOListProcessor.testc Cs|j|jd}|j|}|dk r|j|jkr|}|d jrntjjd}|dj|_d|d_|dj d||j|d}|dk r|j rtjj |dd}|j j |_d|_ tjj |d}|j jjd|jd} |j j|| g|j jjnH|jdkr|}n6tjj ||j}|j jj r:|jd kr:|j|jd <|j jjd xT|D]L} | jd |jrz|j j|d| gntjj |d}|j j|| gqNW|j jjdS)NrrrMr*r@Z looselistrBrArmrirZr)r$r$r$r$r$r$)rBrAr$) get_itemsrNr'rJ SIBLING_TAGSr0rrRrSrTtailrYlstriprrHrPrQrWTAGr lazy_ol STARTSWITHattribr,r!) r"r&r:itemsrXlstrMZlchr@Z firstitemitemrrrr;=s>          zOListProcessor.runcCsg}x|jdD]}|jj|}|rf| rT|jdkrTtjd}|j|jdj|_|j|jdq|j j|r|dj d|j rd|d |f|d <q|j|qd|d |f|d <qW|S) z Break a block into list items. r(rBz(\d+)rr)z%s %sr$r$r$r$r$) r+CHILD_REr[rrrCrDr\rtr-rEr,r!)r"r8rvr2r]Z INTEGER_RErrrrnxs    zOListProcessor.get_itemsN)r<r=r>r?rrrCrDrdrzrErtror9r;rnrrrrr(s   ;rc@seZdZdZdZejdZdS)rz Process unordered list blocks. rAz^[ ]{0,3}[*+-][ ]+(.*)N)r<r=r>r?rrrCrDrdrrrrrsrc@s*eZdZdZejdZddZddZdS)rz Process Hash Headers. z.(^|\n)(?P#{1,6})(?P
.*?)#*(\n|$)cCst|jj|S)N)rcrdre)r"r&r8rrrr9szHashHeaderProcessor.testcCs|jd}|jj|}|r|d|j}||jd}|rN|jj||gtjj |dt |j d}|j dj |_ |r|jd|ntjd|dS)Nrzh%dr5headerzWe've got a problem header: %r)rNrdreriendrrQrrRrYr%r\r.r0rTloggerwarn)r"r&r:r8r]rjafterhrrrr;s  zHashHeaderProcessor.runN) r<r=r>r?rCrDrdr9r;rrrrrs rc@s.eZdZdZejdejZddZddZ dS)rz Process Setext-style Headers. z^.*?\n[=-]+[ ]*(\n|$)cCst|jj|S)N)rcrdr[)r"r&r8rrrr9szSetextHeaderProcessor.testcCsr|jdjd}|djdr$d}nd}tjj|d|}|dj|_t|dkrn|j ddj |dddS)Nrr(r=rlzh%d) rNr+r,rrRrYr.r0r%rTr/)r"r&r:r1r5rrrrr;s zSetextHeaderProcessor.runN) r<r=r>r?rCrD MULTILINErdr9r;rrrrrsrc@s2eZdZdZdZejeejZddZ ddZ dS)rz Process Horizontal Rules. zB^[ ]{0,3}((-+[ ]{0,2}){3,}|(_+[ ]{0,2}){3,}|(\*+[ ]{0,2}){3,})[ ]*cCs>|jj|}|r:|jt|ks0||jdkr:||_dSdS)Nr(TF) SEARCH_RErer|r%r[)r"r&r8r]rrrr9s  $zHRProcessor.testcCsp|jd}|d|jjjd}|r6|jj||gtjj|d||jj dj d}|rl|j d|dS)Nrr(r ) rNr[rirarrQrrRrYr|rqrT)r"r&r:r8ZprelinesZ postlinesrrrr;s zHRProcessor.runN) r<r=r>r?rdrCrDrrr9r;rrrrrs  rc@s eZdZdZddZddZdS)rz< Process blocks that are empty or start with an empty line. cCs| p|jdS)Nr()r,)r"r&r8rrrr9szEmptyBlockProcessor.testcCs|jd}d}|r2d}|dd}|r2|jd||j|}|dk r|jdkrt|r|djdkrtjd|dj|f|d_dS)Nrz r(rr_r z%s%s)rNrTr'rJr%rr`r0)r"r&r:r8ZfillerrbrXrrrr;s    (zEmptyBlockProcessor.runN)r<r=r>r?r9r;rrrrrsrc@s eZdZdZddZddZdS)rz Process Paragraph blocks. cCsdS)NTr)r"r&r8rrrr9szParagraphProcessor.testcCs|jd}|jr|jjjdrz|j|}|dk rV|jrJd|j|f|_qxd||_q|jrnd|j|f|_q|j|_nt j j |d}|j|_dS)NrrZz%s %sz %srM) rNr.rrHrIr'rpr0rqrrRrY)r"r&r:r8rXrMrrrr;s    zParagraphProcessor.runN)r<r=r>r?r9r;rrrrr sr)r? __future__rrrloggingrCr*rZ blockparserr getLoggerr}rrrrrrrrrrrrrrrr s(      X`(k#