javascript - Splitting paragraph into sentences
I'm using the following Python code (which I found online a while ago) to split paragraphs into sentences.
    def splitparagraphintosentences(paragraph):
        import re
        sentenceenders = re.compile(r"""
            # Split sentences on whitespace between them.
            (?:               # Group for 2 positive lookbehinds.
              (?<=[.!?])      # Either end of sentence punct,
            | (?<=[.!?]['"])  # or end of sentence punct and quote.
            )                 # End group of 2 positive lookbehinds.
            (?<!  mr\.   )    # Don't end sentence on "mr."
            (?<!  mrs\.  )    # Don't end sentence on "mrs."
            (?<!  jr\.   )    # Don't end sentence on "jr."
            (?<!  dr\.   )    # Don't end sentence on "dr."
            (?<!  prof\. )    # Don't end sentence on "prof."
            (?<!  sr\.   )    # Don't end sentence on "sr."
            \s+               # Split on whitespace between sentences.
            """, re.IGNORECASE | re.VERBOSE)
        sentencelist = sentenceenders.split(paragraph)
        return sentencelist
It works fine for my purpose, but I need the exact same regex in JavaScript (to make sure the outputs are consistent), and I'm struggling to translate the Python regex into one that is compatible with JavaScript.
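For comparison, a near-literal port is possible where the JavaScript engine implements ES2018 lookbehind assertions. The following is only a sketch under that assumption: the name splitParagraphIntoSentences is hypothetical (it mirrors the Python function), and because JavaScript has no verbose flag the pattern is assembled from commented string pieces instead.

```javascript
// Sketch: port of the Python pattern, assuming ES2018 lookbehind support.
// JavaScript has no verbose (x) flag, so the pattern is built from pieces
// and the comments sit next to each piece.
const sentenceEnders = new RegExp(
  [
    "(?:(?<=[.!?])|(?<=[.!?]['\"]))",   // after end punctuation, optionally followed by a quote
    "(?<!mr\\.)(?<!mrs\\.)(?<!jr\\.)",  // but not directly after these abbreviations
    "(?<!dr\\.)(?<!prof\\.)(?<!sr\\.)",
    "\\s+",                             // split on the whitespace between sentences
  ].join(""),
  "i"                                   // counterpart of re.IGNORECASE
);

// Hypothetical name, mirroring the Python function above.
function splitParagraphIntoSentences(paragraph) {
  return paragraph.split(sentenceEnders);
}

console.log(splitParagraphIntoSentences("Dr. Who arrived. Mrs. Jones left! Was it late?"));
// [ 'Dr. Who arrived.', 'Mrs. Jones left!', 'Was it late?' ]
```

Engines without lookbehind support reject this pattern with a SyntaxError when the RegExp is constructed, so it is not a drop-in option everywhere.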
It is not a regex for a direct split, but a kind of workaround:
(?!mrs?\.|jr\.|dr\.|sr\.|prof\.)(\b\S+[.?!]["']?)\s
Apply it with the global and case-insensitive flags, and replace each matched fragment with, for example: $1#
(or any other character that does not occur in the text, instead of #), and then split on #.
It is not an elegant solution.
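The replace-then-split workaround could be wrapped roughly as follows. This is a minimal sketch, not a tested port of the Python function: splitWithMarker and its marker parameter are hypothetical names, and the g and i flags are assumptions added so that every sentence end is replaced and the abbreviations match in any case.

```javascript
// Pattern from the answer, with "g" (replace every match) and
// "i" (case-insensitive abbreviations) added as assumptions.
const sentenceEnd = /(?!mrs?\.|jr\.|dr\.|sr\.|prof\.)(\b\S+[.?!]["']?)\s/gi;

// Hypothetical helper: `marker` must be a string that never occurs in the text.
function splitWithMarker(paragraph, marker = "#") {
  if (paragraph.includes(marker)) {
    throw new Error("marker occurs in the text; choose another one");
  }
  // A replacer function is used so that a "$" inside the marker is not
  // interpreted as a replacement pattern.
  return paragraph
    .replace(sentenceEnd, (match, sentenceTail) => sentenceTail + marker)
    .split(marker);
}

console.log(splitWithMarker("Dr. Who arrived. Mrs. Jones left! Was it late?"));
// [ 'Dr. Who arrived.', 'Mrs. Jones left!', 'Was it late?' ]
```

Note that the pattern consumes only one whitespace character per sentence end, so runs of several spaces leave leading whitespace on the next sentence, whereas the Python version swallows the whole run with \s+.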