Change text here and there
app.py
CHANGED
```diff
@@ -223,6 +223,24 @@ def highlight_entities():
     return HTML_WRAPPER.format(soup)
 
 
+def highlight_entities_new(summary_str: str):
+    st.session_state.summary_output = summary_str
+    summary_content = st.session_state.summary_output
+    markdown_start_red = "<mark class=\"entity\" style=\"background: rgb(238, 135, 135);\">"
+    markdown_start_green = "<mark class=\"entity\" style=\"background: rgb(121, 236, 121);\">"
+    markdown_end = "</mark>"
+
+    matched_entities, unmatched_entities = get_and_compare_entities(False)
+
+    for entity in matched_entities:
+        summary_content = summary_content.replace(entity, markdown_start_green + entity + markdown_end)
+
+    for entity in unmatched_entities:
+        summary_content = summary_content.replace(entity, markdown_start_red + entity + markdown_end)
+
+    soup = BeautifulSoup(summary_content, features="html.parser")
+    return HTML_WRAPPER.format(soup)
+
+
 def render_dependency_parsing(text: Dict):
     html = render_sentence_custom(text, nlp)
     html = html.replace("\n\n", "\n")
@@ -433,7 +451,7 @@ if summarize_button:
     # DEPENDENCY PARSING PART
     st.header("2️⃣ Dependency comparison")
     st.markdown(
-        "The second method we use for post-processing is called **Dependency
+        "The second method we use for post-processing is called **Dependency Parsing**: the process in which the "
         "grammatical structure in a sentence is analysed, to find out related words as well as the type of the "
         "relationship between them. For the sentence “Jan’s wife is called Sarah” you would get the following "
         "dependency graph:")
@@ -455,7 +473,7 @@ if summarize_button:
         "dependencies between article and summary (as we did with entity matching) would not be a robust method."
         " More on the different sorts of dependencies and their description can be found [here](https://universaldependencies.org/docs/en/dep/).")
     st.markdown("However, we have found that **there are specific dependencies that are often an "
-        "indication of a wrongly constructed sentence**
+        "indication of a wrongly constructed sentence** when there is no article match. We (currently) use 2 "
         "common dependencies which - when present in the summary but not in the article - are highly "
         "indicative of factualness errors. "
         "Furthermore, we only check dependencies between an existing **entity** and its direct connections. "
@@ -489,16 +507,18 @@ if summarize_button:
         "empirically tested they are definitely not sufficiently robust for general use-cases.")
     st.markdown("####")
     st.markdown(
-        "Below we generate 3 different kind of summaries, and based on the two discussed methods, their errors are "
-        "detected to estimate a
+        "*Below we generate 3 different kind of summaries, and based on the two discussed methods, their errors are "
+        "detected to estimate a summary score. Based on this basic approach, "
         "the best summary (read: the one that a human would prefer or indicate as the best one) "
-        "will hopefully be at the top.
+        "will hopefully be at the top. We currently "
         "only do this for the example articles (for which the different summmaries are already generated). The reason "
-        "for this is that HuggingFace spaces are limited in their CPU memory."
+        "for this is that HuggingFace spaces are limited in their CPU memory. We also highlight the entities as done "
+        "before, but note that the rankings are done on a combination of unmatched entities and "
+        "dependencies (with the latter not shown here).*")
     st.markdown("####")
 
     if selected_article != "Provide your own input" and article_text == fetch_article_contents(selected_article):
-        with st.spinner("
+        with st.spinner("Fetching summaries, ranking them and highlighting entities, this might take a minute or two..."):
             summaries_list = []
             deduction_points = []
 
@@ -524,7 +544,8 @@ if summarize_button:
             cur_rank = 1
             rank_downgrade = 0
             for i in range(len(deduction_points)):
-                st.write(f'🏆 Rank {cur_rank} summary: 🏆', display_summary(summaries_list[i]), unsafe_allow_html=True)
+                #st.write(f'🏆 Rank {cur_rank} summary: 🏆', display_summary(summaries_list[i]), unsafe_allow_html=True)
+                st.write(f'🏆 Rank {cur_rank} summary: 🏆', highlight_entities_new(summaries_list[i]), unsafe_allow_html=True)
                 if i < len(deduction_points) - 1:
                     rank_downgrade += 1
                     if not deduction_points[i + 1] == deduction_points[i]:
```
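The `highlight_entities_new` function added in this diff colours matched entities green and unmatched entities red via plain substring replacement before handing the HTML to Streamlit. A minimal self-contained sketch of that approach, with hypothetical entity lists standing in for the app's `get_and_compare_entities(False)` and without the Streamlit/BeautifulSoup wrapping:

```python
# Sketch of the entity-highlighting technique: wrap matched entities in a
# green <mark> span and unmatched entities in a red one, using str.replace.
# GREEN/RED/END mirror the markup strings in the diff; the entity lists
# passed to highlight() are illustrative stand-ins.

GREEN = '<mark class="entity" style="background: rgb(121, 236, 121);">'
RED = '<mark class="entity" style="background: rgb(238, 135, 135);">'
END = "</mark>"


def highlight(summary: str, matched: list, unmatched: list) -> str:
    # Every occurrence of an entity string is wrapped; note that plain
    # substring replacement does not guard against overlapping entities
    # or entity strings that happen to appear inside the injected markup.
    for entity in matched:
        summary = summary.replace(entity, GREEN + entity + END)
    for entity in unmatched:
        summary = summary.replace(entity, RED + entity + END)
    return summary


html = highlight("Jan's wife is called Sarah.", ["Jan"], ["Sarah"])
print(html)
```

Because the replacement is purely textual, an entity that is a substring of another (e.g. "Jan" inside "Janet") would also be wrapped, which is one reason the app parses the result with BeautifulSoup before display rather than trusting the raw string.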