CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then in addition to NLTK (https://www.nltk.org/), you will also need to install the conllu package (https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from https://universaldependencies.
org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from a training corpus from Universal Dependencies. Do not forget to involve the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a form of smoothing yourself, you may for this assignment use the implementation of Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK, this seems to be the most robust one). An example of its use for emission probabilities is in file smoothing.py; one can similarly apply smoothing to transition probabilities.
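As a rough sketch of what this estimation step might look like (the toy sentences, the variable names, and the bins value are illustrative assumptions, not part of the assignment; in your submission the sentences would come from a treebank parsed with conllu):

```python
from collections import defaultdict
from nltk import FreqDist, WittenBellProbDist

# Hypothetical toy training data standing in for the parsed treebank:
# one list of (word, tag) pairs per sentence.
sentences = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]

START, END = "<s>", "</s>"

transitions = defaultdict(list)  # previous tag -> following tags observed
emissions = defaultdict(list)    # tag -> words observed with that tag

for sent in sentences:
    tags = [START] + [t for _, t in sent] + [END]
    for prev, nxt in zip(tags, tags[1:]):
        transitions[prev].append(nxt)
    for word, tag in sent:
        emissions[tag].append(word)

# Relative frequency estimation with Witten-Bell smoothing, in the
# style of smoothing.py; bins is the assumed number of possible
# outcomes (1e5 is an arbitrary large value for the open vocabulary).
smoothed_emissions = {
    tag: WittenBellProbDist(FreqDist(words), bins=1e5)
    for tag, words in emissions.items()
}
smoothed_transitions = {
    tag: WittenBellProbDist(FreqDist(nxts), bins=1e5)
    for tag, nxts in transitions.items()
}

print(smoothed_emissions["NOUN"].prob("dog"))   # seen: relatively large
print(smoothed_emissions["NOUN"].prob("tree"))  # unseen: small but nonzero
```

Note how smoothing reserves probability mass for words never seen with a given tag, so no emission probability is exactly zero.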
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, …, n, in this order:

    t̂_i = argmax_{t_i} P(t_i | t̂_{i−1}) · P(w_i | t_i)

assuming t̂_0 is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker ⟨/s⟩ is not even used here.
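A minimal sketch of this eager tagger follows, with a small hand-made model standing in for the smoothed distributions (all tags, words, and probabilities here are made up for illustration, and the helper names trans_logprob and emit_logprob are assumptions):

```python
import math

# Hand-made toy model; these numbers are invented for illustration
# and are not from the treebanks.
TRANS = {("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
         ("DET", "NOUN"): 0.9, ("DET", "DET"): 0.1,
         ("NOUN", "NOUN"): 0.3, ("NOUN", "DET"): 0.7}
EMIT = {("DET", "the"): 0.9, ("NOUN", "the"): 0.01,
        ("DET", "dog"): 0.01, ("NOUN", "dog"): 0.5}

def trans_logprob(prev, t): return math.log(TRANS.get((prev, t), 1e-10))
def emit_logprob(t, w): return math.log(EMIT.get((t, w), 1e-10))

def eager_tag(words, tagset, trans_logprob, emit_logprob):
    """Greedy left-to-right tagging: choose the i-th tag from the
    previously chosen tag and the i-th word only (no lookahead,
    and the end-of-sentence marker plays no role)."""
    tags, prev = [], "<s>"
    for word in words:
        best = max(tagset,
                   key=lambda t: trans_logprob(prev, t) + emit_logprob(t, word))
        tags.append(best)
        prev = best
    return tags

print(eager_tag(["the", "dog"], ["DET", "NOUN"], trans_logprob, emit_logprob))
# ['DET', 'NOUN']
```

In your submission the two log-probability helpers would of course query the smoothed distributions estimated from the training corpus.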
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
    t̂_1 ⋯ t̂_n = argmax_{t_1 ⋯ t_n} ( ∏_{i=1}^{n} P(t_i | t_{i−1}) · P(w_i | t_i) ) · P(t_{n+1} | t_n)
where the tokens of the input sentence are w_1 ⋯ w_n, and t_0 = ⟨s⟩ and t_{n+1} = ⟨/s⟩ are the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
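The decoder might be sketched as follows, again over a made-up toy model (all numbers and helper names are assumptions; a real implementation would draw its log probabilities from the smoothed distributions):

```python
import math

# Hand-made toy model (invented numbers, for illustration only),
# now including transitions to the end-of-sentence marker.
TRANS = {("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
         ("DET", "NOUN"): 0.9, ("DET", "DET"): 0.1,
         ("NOUN", "NOUN"): 0.2, ("NOUN", "DET"): 0.1,
         ("NOUN", "</s>"): 0.7, ("DET", "</s>"): 0.01}
EMIT = {("DET", "the"): 0.9, ("NOUN", "the"): 0.01,
        ("DET", "dog"): 0.01, ("NOUN", "dog"): 0.5}

def trans_logprob(prev, t): return math.log(TRANS.get((prev, t), 1e-10))
def emit_logprob(t, w): return math.log(EMIT.get((t, w), 1e-10))

def viterbi(words, tagset, trans_logprob, emit_logprob):
    """Viterbi decoding in log space: the tag sequence with the highest
    overall probability, including the final transition to </s>."""
    n = len(words)
    best = [{} for _ in range(n)]  # best[i][t]: best log prob ending in tag t
    back = [{} for _ in range(n)]  # back[i][t]: predecessor tag on that path
    for t in tagset:
        best[0][t] = trans_logprob("<s>", t) + emit_logprob(t, words[0])
    for i in range(1, n):
        for t in tagset:
            prev = max(tagset, key=lambda p: best[i-1][p] + trans_logprob(p, t))
            best[i][t] = (best[i-1][prev] + trans_logprob(prev, t)
                          + emit_logprob(t, words[i]))
            back[i][t] = prev
    # Close off with the transition to the end-of-sentence marker.
    last = max(tagset, key=lambda t: best[n-1][t] + trans_logprob(t, "</s>"))
    tags = [last]
    for i in range(n - 1, 0, -1):
        tags.append(back[i][tags[-1]])
    return list(reversed(tags))

print(viterbi(["the", "dog"], ["DET", "NOUN"], trans_logprob, emit_logprob))
# ['DET', 'NOUN']
```

Working with sums of log probabilities, as here, keeps long products of small probabilities from underflowing to zero.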
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
individually. That is, for each i, we compute:

    t̂_i = argmax_{t_i} Σ_{t_1 ⋯ t_{i−1} t_{i+1} ⋯ t_n} ( ∏_{k=1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n)
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
    t̂_i = argmax_{t_i} ( Σ_{t_1 ⋯ t_{i−1}} ∏_{k=1}^{i} P(t_k | t_{k−1}) · P(w_k | t_k) ) · ( Σ_{t_{i+1} ⋯ t_n} ∏_{k=i+1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n)
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
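One possible shape for this computation, with a hand-rolled logsumexp in the spirit of logsumexptrick.py (the toy model and all names are assumptions made for illustration):

```python
import math

# Same style of toy model as for the other algorithms (made-up numbers).
TRANS = {("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
         ("DET", "NOUN"): 0.9, ("DET", "DET"): 0.1,
         ("NOUN", "NOUN"): 0.2, ("NOUN", "DET"): 0.1,
         ("NOUN", "</s>"): 0.7, ("DET", "</s>"): 0.01}
EMIT = {("DET", "the"): 0.9, ("NOUN", "the"): 0.01,
        ("DET", "dog"): 0.01, ("NOUN", "dog"): 0.5}

def trans_logprob(prev, t): return math.log(TRANS.get((prev, t), 1e-10))
def emit_logprob(t, w): return math.log(EMIT.get((t, w), 1e-10))

def logsumexp(vals):
    """Sum probabilities represented as log probabilities, without
    underflowing in the conversion from logs to probabilities and back."""
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

def most_probable_tags(words, tagset, trans_logprob, emit_logprob):
    """For each position i, the tag with the largest marginal probability,
    computed from forward and backward log values."""
    n = len(words)
    fwd = [{} for _ in range(n)]
    bwd = [{} for _ in range(n)]
    for t in tagset:
        fwd[0][t] = trans_logprob("<s>", t) + emit_logprob(t, words[0])
    for i in range(1, n):
        for t in tagset:
            fwd[i][t] = emit_logprob(t, words[i]) + logsumexp(
                [fwd[i - 1][p] + trans_logprob(p, t) for p in tagset])
    for t in tagset:
        bwd[n - 1][t] = trans_logprob(t, "</s>")
    for i in range(n - 2, -1, -1):
        for t in tagset:
            bwd[i][t] = logsumexp(
                [trans_logprob(t, s) + emit_logprob(s, words[i + 1]) + bwd[i + 1][s]
                 for s in tagset])
    return [max(tagset, key=lambda t: fwd[i][t] + bwd[i][t]) for i in range(n)]

print(most_probable_tags(["the", "dog"], ["DET", "NOUN"],
                         trans_logprob, emit_logprob))
# ['DET', 'NOUN']
```

The forward loop is the Viterbi loop with maximisation replaced by summation, exactly as the text above suggests.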
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
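The per-token accuracy computation might be as simple as the following sketch (the function name, the data layout, and the trivial baseline tagger in the usage lines are all hypothetical):

```python
def accuracy(tagger, test_sentences):
    """Percentage of tokens in the test corpus whose predicted tag
    matches the gold tag. tagger maps a word list to a tag list;
    test_sentences is a list of (word, tag)-pair lists."""
    correct = total = 0
    for sent in test_sentences:
        words = [w for w, _ in sent]
        gold = [t for _, t in sent]
        for predicted_tag, gold_tag in zip(tagger(words), gold):
            correct += predicted_tag == gold_tag
        total += len(gold)
    return 100.0 * correct / total

# Hypothetical usage: a trivial tagger that labels everything DET
# gets one of the two tokens right.
test = [[("the", "DET"), ("dog", "NOUN")]]
print(accuracy(lambda ws: ["DET"] * len(ws), test))  # 50.0
```

The same function can then be applied with each of the three taggers on each treebank's test split.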
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and band
width for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM model is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, mistaking this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume the version of Python3 that is on the lab machines, or older; the marker may not have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.
請(qǐng)加QQ:99515681  郵箱:99515681@qq.com   WX:codehelp 

標(biāo)簽:

掃一掃在手機(jī)打開當(dāng)前頁
  • 上一篇:代做CS252編程、代寫C++設(shè)計(jì)程序
  • 下一篇:AcF633代做、Python設(shè)計(jì)編程代寫
  • 無相關(guān)信息
    昆明生活資訊

    昆明圖文信息
    蝴蝶泉(4A)-大理旅游
    蝴蝶泉(4A)-大理旅游
    油炸竹蟲
    油炸竹蟲
    酸筍煮魚(雞)
    酸筍煮魚(雞)
    竹筒飯
    竹筒飯
    香茅草烤魚
    香茅草烤魚
    檸檬烤魚
    檸檬烤魚
    昆明西山國(guó)家級(jí)風(fēng)景名勝區(qū)
    昆明西山國(guó)家級(jí)風(fēng)景名勝區(qū)
    昆明旅游索道攻略
    昆明旅游索道攻略
  • 短信驗(yàn)證碼平臺(tái) 理財(cái) WPS下載

    關(guān)于我們 | 打賞支持 | 廣告服務(wù) | 聯(lián)系我們 | 網(wǎng)站地圖 | 免責(zé)聲明 | 幫助中心 | 友情鏈接 |

    Copyright © 2025 kmw.cc Inc. All Rights Reserved. 昆明網(wǎng) 版權(quán)所有
    ICP備06013414號(hào)-3 公安備 42010502001045

    激情图片小说一区| 日韩综合一区二区| 香蕉视频在线免费看| 欧美.日韩.国产.一区.二区| 6080成人| 国产国产一区| 在线不卡日本v二区707| 污视频网站在线| 免费一级大片| www.4438全国最大| 日本wwwwww| 日韩精品中文字幕一区| 偷拍日韩校园综合在线| 亚洲一区二区在线免费看| 不卡的看片网站| 久久国产综合精品| 麻豆久久一区二区| 国产一区激情在线| 粉嫩久久99精品久久久久久夜| 麻豆精品一区二区综合av| 国产欧美精品一区二区三区四区 | 欧美日韩欧美| 欧美孕妇孕交| 天堂在线免费av| 黄色国产在线| 91网址在线观看| 中文av在线全新| 伊人发布在线| 牛牛热在线视频| 一区二区三区| 精品久久对白| 亚洲春色h网| 久久精品青草| 久久婷婷av| 国产一区二区三区视频在线播放| 美日韩一区二区三区| 亚洲一区av在线| 中文字幕在线网| av资源网在线观看| 在线观看v片| 国内精品久久久久久久97牛牛 | 国产成人亚洲综合a∨婷婷| 国产不卡视频在线观看| 在线观看欧美日本| 精品国产凹凸成av人导航| 蜜桃视频网站www| 色综合一区二区日本韩国亚洲 | 国内精品久久久久久久97牛牛| 国产人成一区二区三区影院| 成人网18免费软件大全| 九色视频在线播放| 久久综合色占| 欧美专区18| 成人网在线免费视频| 久久精品一级爱片| 色综合色综合色综合| 日日摸日日添日日躁av| 国产在线观看黄| 天天免费亚洲黑人免费| 久久资源综合| 新67194成人永久网站| 精品福利一区二区| h片在线观看网站| 欧美久久成人| 久久这里只有精品视频网| 五月婷婷色综合| a天堂中文在线官网在线| 影音先锋日韩资源| 久久精品人人做人人爽97 | 午夜国产精品视频| 红桃视频成人在线观看| 美女精品导航| 网友自拍区视频精品| 国产日韩综合av| 黄色av免费在线观看| 成人在线免费视频观看| 国产精品一二三四区| 亚洲一区二区影院| 成人在线播放免费观看| 久久夜色精品国产噜噜av小说| 337p粉嫩大胆噜噜噜噜噜91av | 欧美高清你懂得| av资源网在线观看| 99精品国产福利在线观看免费 | 欧美日韩在线影院| av福利导福航大全在线播放| 国产精品一线天粉嫩av| 亚洲午夜影视影院在线观看| 嫩草嫩草嫩草嫩草| 日本不卡高清| 欧美色视频在线| 91caoporm在线视频| 国产欧美综合一区二区三区| 国产精品全国免费观看高清| 男女免费网站| 精品国产一区一区二区三亚瑟| 国产精品69毛片高清亚洲| 人成网站免费观看| 9999精品| 久久www免费人成看片高清| 国产精品午夜久久久久久| 久久亚洲人体| 国产精品国产三级国产aⅴ无密码| 成人综合av| 色琪琪久久se色| 最新日韩av在线| 先锋av资源网| 精品国产91| 欧美精品v国产精品v日韩精品| 欧美一区 二区| 国产精品原创巨作av| 青梅竹马是消防员在线| 久久久久国产精品一区二区| 毛片在线网址播放| 日韩动漫一区| 日韩欧美有码在线| 免费观看成人www动漫视频| 欧美午夜片在线免费观看| 精品成人自拍视频| 欧美日韩一级片在线观看| 新片速递亚洲合集欧美合集| 中文字幕免费不卡| 欧美一区二区少妇| 麻豆91在线播放免费| 1024国产在线| 91视频免费看| 在线宅男视频| av在线不卡顿| 亚洲午夜久久久久中文字幕久| 婷婷久久综合九色综合99蜜桃| 亚洲午夜精品17c| 加勒比中文字幕精品| 5月丁香婷婷综合| 一区二区三区国产精华| 在线观看成人免费视频| 国产一区二区三区四区五区| 日韩欧美的一区| 久久狠狠久久| 91精品国产一区二区三区蜜臀| 亚洲伦理影院| 精品国产91乱高清在线观看| 国产精品扒开腿做爽爽爽视频软件| 日韩理论片中文av| 精品福利一区| 可以免费看污视频的网站| 久久这里有精品15一区二区三区| 国产二区在线播放| 国产精品老牛| 麻豆网在线观看| 全国精品久久少妇| av成人网在线| 久久aⅴ国产欧美74aaa| sm在线观看| 久久人人97超碰com| 毛片免费在线观看| 91色婷婷久久久久合中文| 99久久这里有精品| 8x福利精品第一导航| 亚洲伊人网站| 992tv在线观看免费进| 国产成人综合亚洲网站| 久久xxx视频| 最新不卡av在线| 奇米亚洲欧美| 天堂a中文在线| 国产欧美综合色| 少妇精品久久久一区二区三区| 欧美h版电影| 91亚洲精品久久久蜜桃| 99a精品视频在线观看| 交换国产精品视频一区| 本田岬高潮一区二区三区| 北条麻妃在线一区二区免费播放 | 福利在线一区| 2023欧美最顶级a∨艳星| 久久久久久久久伊人| 
你懂的一区二区三区| 欧美在线|欧美| 色88888久久久久久影院| 日韩黄色动漫| 一区在线观看免费| 超碰国产精品一区二页| wwwav91com| 国产欧美精品一区aⅴ影院 | 老司机精品视频网| 精品久久国产字幕高潮| 成人av电影在线播放| 欧美美女在线| 羞羞视频在线免费国产| 欧美日韩一卡二卡三卡 | 亚洲免费大片| 国产一区二区三区影视| 区一区二日本| 亚洲午夜在线电影| 日韩av一级片| 欧美成人视屏| 欧美日韩在线一区二区| 不卡欧美aaaaa| 99久久久久久中文字幕一区| 久草在线资源视频|