twitter / twitter-korean-text
File Size

The distribution of file sizes (measured in lines of code).

File Size Overall
1001+ | 501-1000 | 201-500 | 101-200 | 1-100
0% | 0% | 29% | 32% | 38%


File Size per Extension
Extension | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
java | 0% | 0% | 54% | 21% | 24%
scala | 0% | 0% | 15% | 39% | 45%
xml | 0% | 0% | 0% | 0% | 100%
sbt | 0% | 0% | 0% | 0% | 100%
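
The per-extension figures appear consistent with each bucket holding the share of the extension's total lines of code that sit in files of that size, truncated to a whole percent. This is an inference from the numbers in this report, not a documented description of the tool. A minimal sketch under that assumption, using the line counts of the six .java files listed under Longest Files (the names `BUCKETS`, `bucket_for`, and `size_distribution` are hypothetical):

```python
# Bucket upper bounds (inclusive), taken from the report's legend.
BUCKETS = [
    ("1-100", 100),
    ("101-200", 200),
    ("201-500", 500),
    ("501-1000", 1000),
    ("1001+", float("inf")),
]


def bucket_for(lines):
    """Return the legend label of the first bucket whose upper bound fits."""
    for label, upper in BUCKETS:
        if lines <= upper:
            return label


def size_distribution(line_counts):
    """Share of total lines per bucket, truncated to a whole percent.

    Assumes line_counts is non-empty (otherwise the total would be zero).
    """
    total = sum(line_counts)
    per_bucket = {label: 0 for label, _ in BUCKETS}
    for n in line_counts:
        per_bucket[bucket_for(n)] += n
    return {label: 100 * lines // total for label, lines in per_bucket.items()}


# Line counts of the six .java files listed under "Longest Files".
java_files = [476, 186, 87, 73, 38, 12]
print(size_distribution(java_files))
# {'1-100': 24, '101-200': 21, '201-500': 54, '501-1000': 0, '1001+': 0}
```

Truncating instead of rounding would also explain why some rows in this report add up to slightly less than 100%.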
File Size per Logical Decomposition
primary
Component | 1001+ | 501-1000 | 201-500 | 101-200 | 1-100
src | 0% | 0% | 29% | 32% | 37%
ROOT | 0% | 0% | 0% | 0% | 100%
Longest Files (Top 32)
File | Folder | # lines | # units
CharArrayMap.java | src/main/java/com/twitter/penguin/korean/util | 476 | 62
KoreanPhraseExtractor.scala | src/main/scala/com/twitter/penguin/korean/phrase_extractor | 238 | 11
CharacterUtils.java | src/main/java/com/twitter/penguin/korean/util | 186 | 21
KoreanConjugation.scala | src/main/scala/com/twitter/penguin/korean/util | 155 | 3
KoreanTokenizer.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 117 | 3
KoreanChunker.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 115 | 8
KoreanNormalizer.scala | src/main/scala/com/twitter/penguin/korean/normalizer | 111 | 6
KoreanDictionaryProvider.scala | src/main/scala/com/twitter/penguin/korean/util | 104 | 7
CharArraySet.java | src/main/java/com/twitter/penguin/korean/util | 87 | 15
TwitterKoreanProcessorJava.java | src/main/java/com/twitter/penguin/korean | 73 | 11
KoreanPos.scala | src/main/scala/com/twitter/penguin/korean/util | 72 | 2
KoreanSubstantive.scala | src/main/scala/com/twitter/penguin/korean/util | 68 | 5
ParsedChunk.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 64 | 5
Hangul.scala | src/main/scala/com/twitter/penguin/korean/util | 58 | 2
BatchTokenizeTweets.scala | src/main/scala/com/twitter/penguin/korean/qa | 58 | 2
KoreanStemmer.scala | src/main/scala/com/twitter/penguin/korean/stemmer | 51 | 1
DeduplicateAndSortDictionaries.scala | src/main/scala/com/twitter/penguin/korean/tools | 39 | 1
KoreanTokenJava.java | src/main/java/com/twitter/penguin/korean | 38 | 7
BatchGetUnknownNouns.scala | src/main/scala/com/twitter/penguin/korean/qa | 37 | 1
TwitterKoreanProcessor.scala | src/main/scala/com/twitter/penguin/korean | 36 | 6
KoreanDetokenizer.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 34 | 3
CreatePhraseExtractionExamples.scala | src/main/scala/com/twitter/penguin/korean/tools | 33 | -
CreateParsingExamples.scala | src/main/scala/com/twitter/penguin/korean/tools | 30 | -
CreateConjugationExamples.scala | src/main/scala/com/twitter/penguin/korean/tools | 27 | 1
TokenizerProfile.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 23 | -
KoreanSentenceSplitter.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 22 | 1
UpdateAllTheExamples.scala | src/main/scala/com/twitter/penguin/korean/tools | 20 | 2
(missing) | (missing) | 18 | -
KoreanProcessorSandbox.scala | src/main/scala/com/twitter/penguin/korean/qa | 17 | 1
KoreanPosJava.java | src/main/java/com/twitter/penguin/korean | 12 | -
build.sbt | root | 10 | -
Runnable.scala | src/main/scala/com/twitter/penguin/korean/tools | 7 | 1
Files With Most Units (Top 26)
File | Folder | # lines | # units
CharArrayMap.java | src/main/java/com/twitter/penguin/korean/util | 476 | 62
CharacterUtils.java | src/main/java/com/twitter/penguin/korean/util | 186 | 21
CharArraySet.java | src/main/java/com/twitter/penguin/korean/util | 87 | 15
TwitterKoreanProcessorJava.java | src/main/java/com/twitter/penguin/korean | 73 | 11
KoreanPhraseExtractor.scala | src/main/scala/com/twitter/penguin/korean/phrase_extractor | 238 | 11
KoreanChunker.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 115 | 8
KoreanTokenJava.java | src/main/java/com/twitter/penguin/korean | 38 | 7
KoreanDictionaryProvider.scala | src/main/scala/com/twitter/penguin/korean/util | 104 | 7
TwitterKoreanProcessor.scala | src/main/scala/com/twitter/penguin/korean | 36 | 6
KoreanNormalizer.scala | src/main/scala/com/twitter/penguin/korean/normalizer | 111 | 6
KoreanSubstantive.scala | src/main/scala/com/twitter/penguin/korean/util | 68 | 5
ParsedChunk.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 64 | 5
KoreanConjugation.scala | src/main/scala/com/twitter/penguin/korean/util | 155 | 3
KoreanTokenizer.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 117 | 3
KoreanDetokenizer.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 34 | 3
UpdateAllTheExamples.scala | src/main/scala/com/twitter/penguin/korean/tools | 20 | 2
Hangul.scala | src/main/scala/com/twitter/penguin/korean/util | 58 | 2
KoreanPos.scala | src/main/scala/com/twitter/penguin/korean/util | 72 | 2
BatchTokenizeTweets.scala | src/main/scala/com/twitter/penguin/korean/qa | 58 | 2
CreateConjugationExamples.scala | src/main/scala/com/twitter/penguin/korean/tools | 27 | 1
DeduplicateAndSortDictionaries.scala | src/main/scala/com/twitter/penguin/korean/tools | 39 | 1
Runnable.scala | src/main/scala/com/twitter/penguin/korean/tools | 7 | 1
KoreanSentenceSplitter.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 22 | 1
KoreanProcessorSandbox.scala | src/main/scala/com/twitter/penguin/korean/qa | 17 | 1
BatchGetUnknownNouns.scala | src/main/scala/com/twitter/penguin/korean/qa | 37 | 1
KoreanStemmer.scala | src/main/scala/com/twitter/penguin/korean/stemmer | 51 | 1
Files With Long Lines (Top 4)

There are 4 files with lines longer than 120 characters. In total, there are 5 long lines.

File | Folder | # lines | # units | # long lines
KoreanChunker.scala | src/main/scala/com/twitter/penguin/korean/tokenizer | 115 | 8 | 2
TwitterKoreanProcessorJava.java | src/main/java/com/twitter/penguin/korean | 73 | 11 | 1
KoreanConjugation.scala | src/main/scala/com/twitter/penguin/korean/util | 155 | 3 | 1
BatchTokenizeTweets.scala | src/main/scala/com/twitter/penguin/korean/qa | 58 | 2 | 1
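
A count like the one above could be reproduced with a check along the following lines. This is only a sketch: the 120-character threshold comes from the report, while the function names, the in-memory `sources` mapping, and the sample data are hypothetical and not taken from the analysis tool or the repository.

```python
def count_long_lines(text, limit=120):
    """Count lines whose length exceeds `limit` characters."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def files_with_long_lines(sources, limit=120):
    """Return (file name, long-line count) pairs, most long lines first.

    `sources` maps a file name to its full text. Files without any
    long line are left out of the result.
    """
    counts = {name: count_long_lines(body, limit) for name, body in sources.items()}
    flagged = [(name, n) for name, n in counts.items() if n > 0]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical sample input, not real repository content.
sources = {
    "A.scala": "x" * 121 + "\n" + "y" * 130 + "\nshort",
    "B.java": "z" * 120,   # exactly at the limit, so not counted
    "C.scala": "w" * 200,
}
print(files_with_long_lines(sources))
# [('A.scala', 2), ('C.scala', 1)]
```

A line of exactly 120 characters is not flagged here, matching the report's wording of "longer than 120 characters".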