ToPan (multilingual topic modelling for Greek, Latin, Arabic and other languages)

(Meletē)ToPān v.0.1
The name (Meletē)ToPān v.0.1 is based on the Greek principle μελέτη τὸ πᾶν, which roughly translates to "take everything into care". I chose the name because Topic-Modelling performs well on large numbers of logically structured chunks of text, and it helps to select the interesting bits of a large corpus by technically having looked at everything. The butterfly in the logo belongs to the genus Melete. The original photograph is by Didier Descouens, who has licensed it under CC BY-SA 4.0. I changed the image slightly for the logo. I strongly suggest starting from the original if you want to use it, but you can also use this slightly modified logo under the CC BY-SA 4.0 license, as I am required to share it under the same license as the original image.

ToPān is Topic-Modelling for everyone: from people without programming knowledge to people who want to build teaching and text-reuse tools and apps based on Topic-Modelling data without having to develop their own tool or substantially restructure their textual data. ToPān is made to be shared and used. That is why I tried to modularise ToPān so that you can ingest your own data at each step. It works best, however, if you work your way from left to right: from "Data Input" to "LDA Tables" (please find more details under "Instructions"). ToPān works best with files that are structured according to the CTS/CITE architecture.
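
For readers unfamiliar with CTS, the following is a minimal sketch of how a CTS URN identifies a passage. It is written in Python purely for illustration and is not part of ToPān. The function name `parse_cts_urn`, the `CtsUrn` type and the version label `example-lat1` are assumptions made for this example; only the work identifier `phi0972.phi001` is taken from a file name in the repository. Sub-references (the `@` syntax) are not handled.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    """Components of a CTS URN (hypothetical helper type for this sketch)."""
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]
    passage: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split urn:cts:NAMESPACE:TEXTGROUP[.WORK[.VERSION]][:PASSAGE] into parts."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")

    namespace, work_component = parts[2], parts[3]
    # The passage component is optional; an empty one counts as absent.
    passage = parts[4] if len(parts) == 5 and parts[4] else None

    # The work component is a dot-separated hierarchy: textgroup.work.version
    hierarchy = work_component.split(".")
    textgroup = hierarchy[0]
    work = hierarchy[1] if len(hierarchy) > 1 else None
    version = hierarchy[2] if len(hierarchy) > 2 else None

    return CtsUrn(namespace, textgroup, work, version, passage)


if __name__ == "__main__":
    # The version label "example-lat1" and passage "1.1" are made up.
    print(parse_cts_urn("urn:cts:latinLit:phi0972.phi001.example-lat1:1.1"))
```

Because every passage carries such an identifier, each chunk of text fed into a topic model can be traced back to its exact place in the source edition.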

ToPān is also still under active development. This is an alpha release. More features will be added, and you are encouraged to road-test ToPān and to send me feedback or report bugs.

Catullus	fix stop word bug, create sample datasets	11 days ago
Models	fix stop word bug, create sample datasets	11 days ago
www	fix stop word bug, create sample datasets	11 days ago
.gitignore	Create .gitignore	4 months ago
Catullus.R	fix stop word bug, create sample datasets	11 days ago
LICENSE	Create LICENSE	3 months ago
Petronius.csv	recovery	4 months ago
README.md	Update README.md	3 months ago
Sandbox2.RData	experimenting for switch from RCurl to httr	3 months ago
Sandbox2.Rhistory	experimenting for switch from RCurl to httr	3 months ago
StemDic.rds	major updates and changes	3 months ago
WordEmbedVec.R	fix stop word bug, create sample datasets	11 days ago
app.R	fix stop word bug, create sample datasets	11 days ago
caesar.csv	fix stop word bug, create sample datasets	11 days ago
catullus.csv	fix stop word bug, create sample datasets	11 days ago
copyright.md	update description	3 months ago
corpus.rds	update	4 months ago
dataentry.md	update description	3 months ago
home.md	Update home.md	3 months ago
message-handler.js	major updates and changes	3 months ago
morphologicalnormalisation.md	update description	3 months ago
phi0972.phi001Parsed.82xf	implement 82XF	3 months ago
preliminary.md	update description	3 months ago
sandbox.R	update	4 months ago
sandbox2.R	implement 82XF	3 months ago
settingtmvalues.md	update description	3 months ago
temp_vectors.bin	fix stop word bug, create sample datasets	11 days ago
treebank.xml	update	4 months ago
understandingresults.md	adding instructions	3 months ago

AWOL - The Ancient World Online

Sunday, October 16, 2016

