twitter / communitynotes

File Size

Intro

File size measurements show how file sizes are distributed across the codebase.
Files are classified into five categories based on their size (lines of code): 1-100 (very small files), 101-200 (small files), 201-500 (medium size files), 501-1000 (long files), 1001+ (very long files).
It is good practice to keep files small. Long files may become "bloaters": code that has grown to such proportions that it is hard to work with.
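
The size categories above can be written as a small lookup. This is a minimal sketch of the thresholds as stated in this report; the function name `size_category` is hypothetical and is not taken from the tool that generated the report.

```python
def size_category(lines_of_code: int) -> str:
    """Map a file's line count to one of the report's five size categories."""
    if lines_of_code < 1:
        raise ValueError("lines_of_code must be at least 1")
    if lines_of_code <= 100:
        return "very small"
    if lines_of_code <= 200:
        return "small"
    if lines_of_code <= 500:
        return "medium size"
    if lines_of_code <= 1000:
        return "long"
    return "very long"
```

For example, `size_category(1489)` falls in the "very long" category and `size_category(201)` in "medium size".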


File Size Overall

There are 42 files with 10,951 lines of code.

1 very long file (1,489 lines of code)
5 long files (3,551 lines of code)
11 medium size files (3,902 lines of code)
9 small files (1,347 lines of code)
16 very small files (662 lines of code)
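
The per-category totals can be recomputed from the per-file line counts listed in the "Longest Files" table of this report. The sketch below is an illustration only: the names `UPPER_BOUNDS`, `LABELS`, `LINE_COUNTS`, and `summarize` are hypothetical, and the bucketing via `bisect_left` is one possible way to do it, not necessarily how the report tool works.

```python
from bisect import bisect_left

# Inclusive upper bounds of the first four categories; anything above is "very long".
UPPER_BOUNDS = [100, 200, 500, 1000]
LABELS = ["very small", "small", "medium size", "long", "very long"]

# Line counts copied from the "Longest Files" table in this report (42 files).
LINE_COUNTS = [
    1489, 923, 827, 702, 564, 535, 479, 467, 442, 439, 412, 389, 325, 286,
    247, 215, 201, 177, 173, 167, 167, 158, 136, 129, 120, 120, 94, 79, 76,
    73, 64, 61, 59, 52, 41, 31, 23, 5, 1, 1, 1, 1,
]


def summarize(line_counts):
    """Return {label: (file_count, total_lines)} for each size category."""
    files = dict.fromkeys(LABELS, 0)
    lines = dict.fromkeys(LABELS, 0)
    for loc in line_counts:
        # bisect_left returns the index of the first bound >= loc,
        # so 100 maps to "very small" and 101 maps to "small".
        label = LABELS[bisect_left(UPPER_BOUNDS, loc)]
        files[label] += 1
        lines[label] += loc
    return {label: (files[label], lines[label]) for label in LABELS}


if __name__ == "__main__":
    for label, (n_files, n_lines) in summarize(LINE_COUNTS).items():
        print(f"{label}: files={n_files}, lines={n_lines:,}")
```

Run against the table data, the category counts and line totals should match the summary list above.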

Legend (lines of code): 1001+ | 501-1000 | 201-500 | 101-200 | 1-100


File Size per Extension

[Chart: file size distribution per file extension, by size category (1001+, 501-1000, 201-500, 101-200, 1-100)]

File Size per Logical Decomposition

primary

[Chart: file size distribution for this component, by size category (1001+, 501-1000, 201-500, 101-200, 1-100)]

Longest Files (Top 42)

File	# lines	# units
run_scoring.py in sourcecode/scoring	1489	25
mf_base_scorer.py in sourcecode/scoring	923	19
constants.py in sourcecode/scoring	827	5
scoring_rules.py in sourcecode/scoring	702	38
process_data.py in sourcecode/scoring	564	26
pflip_model.py in sourcecode/scoring	535	22
pandas_utils.py in sourcecode/scoring	479	21
matrix_factorization.py in sourcecode/scoring/matrix_factorization	467	17
contributor_state.py in sourcecode/scoring	442	17
reputation_matrix_factorization.py in sourcecode/scoring/reputation_matrix_factorization	439	16
note_ratings.py in sourcecode/scoring	412	8
post_selection_similarity_old.py in sourcecode/scoring	389	20
scorer.py in sourcecode/scoring	325	20
pseudo_raters.py in sourcecode/scoring/matrix_factorization	286	12
runner.py in sourcecode/scoring	247	3
post_selection_similarity.py in sourcecode/scoring	215	9
note_status_history.py in sourcecode/scoring	201	7
mf_group_scorer.py in sourcecode/scoring	177	13
mf_topic_scorer.py in sourcecode/scoring	173	10
topic_model.py in sourcecode/scoring	167	9
diligence_model.py in sourcecode/scoring/reputation_matrix_factorization	167	4
helpfulness_scores.py in sourcecode/scoring	158	4
reputation_scorer.py in sourcecode/scoring	136	12
helpfulness_model.py in sourcecode/scoring/reputation_matrix_factorization	129	3
normalized_loss.py in sourcecode/scoring/matrix_factorization	120	6
incorrect_filter.py in sourcecode/scoring	120	4
tag_consensus.py in sourcecode/scoring	94	2
mf_expansion_scorer.py in sourcecode/scoring	79	9
explanation_tags.py in sourcecode/scoring	76	3
mf_expansion_plus_scorer.py in sourcecode/scoring	73	9
mf_core_scorer.py in sourcecode/scoring	64	6
weighted_loss.py in sourcecode/scoring/reputation_matrix_factorization	61	4
tag_filter.py in sourcecode/scoring	59	5
model.py in sourcecode/scoring/matrix_factorization	52	5
dataset.py in sourcecode/scoring/reputation_matrix_factorization	41	1
mf_multi_group_scorer.py in sourcecode/scoring	31	4
enums.py in sourcecode/scoring	23	1
main.py in sourcecode	5	-
__init__.py in sourcecode	1	-
__init__.py in sourcecode/scoring/matrix_factorization	1	-
__init__.py in sourcecode/scoring	1	-
__init__.py in sourcecode/scoring/reputation_matrix_factorization	1	-

Files With Most Units (Top 37)

File	# lines	# units
scoring_rules.py in sourcecode/scoring	702	38
process_data.py in sourcecode/scoring	564	26
run_scoring.py in sourcecode/scoring	1489	25
pflip_model.py in sourcecode/scoring	535	22
pandas_utils.py in sourcecode/scoring	479	21
post_selection_similarity_old.py in sourcecode/scoring	389	20
scorer.py in sourcecode/scoring	325	20
mf_base_scorer.py in sourcecode/scoring	923	19
matrix_factorization.py in sourcecode/scoring/matrix_factorization	467	17
contributor_state.py in sourcecode/scoring	442	17
reputation_matrix_factorization.py in sourcecode/scoring/reputation_matrix_factorization	439	16
mf_group_scorer.py in sourcecode/scoring	177	13
pseudo_raters.py in sourcecode/scoring/matrix_factorization	286	12
reputation_scorer.py in sourcecode/scoring	136	12
mf_topic_scorer.py in sourcecode/scoring	173	10
mf_expansion_scorer.py in sourcecode/scoring	79	9
mf_expansion_plus_scorer.py in sourcecode/scoring	73	9
post_selection_similarity.py in sourcecode/scoring	215	9
topic_model.py in sourcecode/scoring	167	9
note_ratings.py in sourcecode/scoring	412	8
note_status_history.py in sourcecode/scoring	201	7
mf_core_scorer.py in sourcecode/scoring	64	6
normalized_loss.py in sourcecode/scoring/matrix_factorization	120	6
model.py in sourcecode/scoring/matrix_factorization	52	5
constants.py in sourcecode/scoring	827	5
tag_filter.py in sourcecode/scoring	59	5
mf_multi_group_scorer.py in sourcecode/scoring	31	4
incorrect_filter.py in sourcecode/scoring	120	4
helpfulness_scores.py in sourcecode/scoring	158	4
weighted_loss.py in sourcecode/scoring/reputation_matrix_factorization	61	4
diligence_model.py in sourcecode/scoring/reputation_matrix_factorization	167	4
runner.py in sourcecode/scoring	247	3
explanation_tags.py in sourcecode/scoring	76	3
helpfulness_model.py in sourcecode/scoring/reputation_matrix_factorization	129	3
tag_consensus.py in sourcecode/scoring	94	2
enums.py in sourcecode/scoring	23	1
dataset.py in sourcecode/scoring/reputation_matrix_factorization	41	1

Files With Long Lines (Top 7)

There are 7 files with lines longer than 120 characters. In total, there are 31 long lines.

File	# lines	# units	# long lines
process_data.py in sourcecode/scoring	564	26	15
note_status_history.py in sourcecode/scoring	201	7	5
contributor_state.py in sourcecode/scoring	442	17	4
mf_base_scorer.py in sourcecode/scoring	923	19	2
run_scoring.py in sourcecode/scoring	1489	25	2
note_ratings.py in sourcecode/scoring	412	8	2
helpfulness_scores.py in sourcecode/scoring	158	4	1
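
A long-line count of this kind can be approximated with a few lines of Python. This is a sketch under assumptions: the report does not say how the tool measures line length (for example, how tabs or trailing whitespace are counted), so the version below simply uses the character count of each line without its newline. The names `MAX_LINE_LENGTH`, `count_long_lines`, and `long_lines_in_file` are hypothetical.

```python
from pathlib import Path

# Threshold stated in the report: lines longer than 120 characters.
MAX_LINE_LENGTH = 120


def count_long_lines(text: str, limit: int = MAX_LINE_LENGTH) -> int:
    """Count lines whose length (excluding the line break) exceeds `limit`."""
    return sum(1 for line in text.splitlines() if len(line) > limit)


def long_lines_in_file(path, limit: int = MAX_LINE_LENGTH) -> int:
    """Count long lines in a UTF-8 text file (assumed encoding)."""
    return count_long_lines(Path(path).read_text(encoding="utf-8"), limit)
```

A line of exactly 120 characters is not counted, since the report describes lines "longer than 120 characters".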

Correlations

File Size vs. Commits (all time): 42 points

lines of code min: 1.0 | average: 260.74 | 25th percentile: 60.5 | median: 162.5 | 75th percentile: 418.75 | max: 1489.0
commits (all time) min: 1.0 | average: 21.4 | 25th percentile: 7.5 | median: 16.0 | 75th percentile: 29.25 | max: 74.0
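
The "lines of code" summary statistics can be reproduced from the per-file line counts in the "Longest Files" table. The sketch below uses Python's `statistics.quantiles` with the "exclusive" interpolation method, which happens to match the reported quartiles; that is an observation about the numbers, not a claim about how the report tool computes them. The names `LINE_COUNTS` and `describe` are hypothetical. The commit counts per file are not listed in this report, so only the line-count axis is covered.

```python
import statistics

# Line counts copied from the "Longest Files" table in this report (42 files).
LINE_COUNTS = [
    1489, 923, 827, 702, 564, 535, 479, 467, 442, 439, 412, 389, 325, 286,
    247, 215, 201, 177, 173, 167, 167, 158, 136, 129, 120, 120, 94, 79, 76,
    73, 64, 61, 59, 52, 41, 31, 23, 5, 1, 1, 1, 1,
]


def describe(values):
    """Return min, average, quartiles, and max for a list of numbers."""
    # n=4 yields the three quartile cut points; "exclusive" interpolates
    # at position p * (len(values) + 1) in the sorted data.
    q1, median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "min": min(values),
        "average": round(statistics.mean(values), 2),
        "25th percentile": q1,
        "median": median,
        "75th percentile": q3,
        "max": max(values),
    }


if __name__ == "__main__":
    for name, value in describe(LINE_COUNTS).items():
        print(f"{name}: {value}")
```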

File Size vs. Contributors (all time): 42 points

lines of code min: 1.0 | average: 260.74 | 25th percentile: 60.5 | median: 162.5 | 75th percentile: 418.75 | max: 1489.0
contributors (all time) min: 1.0 | average: 3.57 | 25th percentile: 2.75 | median: 3.0 | 75th percentile: 5.0 | max: 8.0

File Size vs. Commits (30 days): 0 points

No data for "commits (30d)" vs. "lines of code".

File Size vs. Contributors (30 days): 0 points

No data for "contributors (30d)" vs. "lines of code".

File Size vs. Commits (90 days): 4 points

lines of code min: 479.0 | average: 874.25 | 25th percentile: 534.75 | median: 764.5 | 75th percentile: 1323.5 | max: 1489.0
commits (90d) min: 2.0 | average: 2.0 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 2.0 | max: 2.0

File Size vs. Contributors (90 days): 4 points

lines of code min: 479.0 | average: 874.25 | 25th percentile: 534.75 | median: 764.5 | 75th percentile: 1323.5 | max: 1489.0
contributors (90d) min: 2.0 | average: 2.0 | 25th percentile: 2.0 | median: 2.0 | 75th percentile: 2.0 | max: 2.0