1 | <warner> use the 30 hits from find src contrib misc -type f |grep -v pyc |
---|
2 | |xargs grep '\bget_stats\b'| grep -v 'def get_stats' |
---|
3 | <warner> let's walk through them |
---|
4 | |
---|
5 | <warner> 6 in stats.py [16:29] |
---|
6 | <warner> (I use \b so we'll match things like callRemote("get_stats") and |
---|
7 | x=blah.get_stats; x() , not that anyone does the latter) |
---|
8 | <arc`> It looks like all the references without dot prefixes are either |
---|
9 | comments or definitions or string literals containing get_stats [16:30] |
---|
10 | <warner> yeah. the string literal form is what invokes |
---|
11 | RIStatsProvider.get_stats [16:31] |
---|
12 | <warner> it's in a callRemote() argument |
---|
13 | <arc`> I'm not seeing those I'm seeing log.msg("foofooget_stats() -> foo") |
---|
14 | |
---|
15 | <warner> now, 5 hits are in StatsProvider, which is the internal |
---|
16 | aggregate-all-the-producers object that provides RIStatsProvider |
---|
17 | [16:32] |
---|
18 | <warner> oh, my \b probably skipped those |
---|
19 | <warner> the \b matches word boundaries |
---|
20 | <warner> so blahXYZ won't match \bXYZ |
---|
21 | |
---|
22 | <warner> another hit is in stats.StatsGatherer, which is the service that does |
---|
23 | callRemote('get_stats') and retrieves stats through the |
---|
24 | RIStatsProvider.get_stats interface |
---|
25 | <warner> that's the receiving end of the remote protocol |
---|
26 | <warner> er, the initiating end, but it's where the stats wind up |
---|
27 | <warner> from line 211, you can see that the stats get sent to self.got_stats |
---|
28 | [16:34] |
---|
29 | <warner> which, from the NotImplementedError on line 227, you can tell is |
---|
30 | meant to be overridden by subclasses, so we need to look for 'def |
---|
31 | got_stats' in subclasses. Fortunately, all the subclasses are in this |
---|
32 | same file, right after StatsGatherer, so we don't have to look far |
---|
33 | [16:35] |
---|
34 | <warner> StdOutStatsGatherer.got_stats just uses pprint() on them |
---|
35 | <warner> and PickleStatsGatherer.got_stats pickles them [16:36] |
---|
36 | <warner> it turns out that misc/operations_helpers/munin/tahoe_stats reads |
---|
37 | that pickle |
---|
38 | <warner> line 460 is where it extracts an individual statistic [16:38] |
---|
39 | <warner> and on line 461 it does 'if value is not None', so it can probably |
---|
40 | tolerate None |
---|
41 | <warner> also, the PLUGINS table that makes up most of the file doesn't look |
---|
42 | at latencies at all [16:39] |
---|
43 | <warner> ok, so that covers stats.py |
---|
44 | <warner> next up: web/check_results.py |
---|
45 | ERC> |
---|
46 | <warner> another hit is in stats.StatsGatherer, which is the service that does |
---|
47 | callRemote('get_stats') and retrieves stats through the |
---|
48 | RIStatsProvider.get_stats interface |
---|
49 | <warner> that's the receiving end of the remote protocol |
---|
50 | <warner> er, the initiating end, but it's where the stats wind up |
---|
51 | <warner> from line 211, you can see that the stats get sent to self.got_stats |
---|
52 | [16:34] |
---|
53 | <warner> which, from the NotImplementedError on line 227, you can tell is |
---|
54 | meant to be overridden by subclasses, so we need to look for 'def |
---|
55 | got_stats' in subclasses. Fortunately, all the subclasses are in this |
---|
56 | same file, right after StatsGatherer, so we don't have to look far |
---|
57 | [16:35] |
---|
58 | <warner> StdOutStatsGatherer.got_stats just uses pprint() on them |
---|
59 | <warner> and PickleStatsGatherer.got_stats pickles them [16:36] |
---|
60 | <warner> it turns out that misc/operations_helpers/munin/tahoe_stats reads |
---|
61 | that pickle |
---|
62 | <warner> line 460 is where it extracts an individual statistic [16:38] |
---|
63 | <warner> and on line 461 it does 'if value is not None', so it can probably |
---|
64 | tolerate None |
---|
65 | <warner> also, the PLUGINS table that makes up most of the file doesn't look |
---|
66 | at latencies at all [16:39] |
---|
67 | <warner> ok, so that covers stats.py |
---|
68 | <warner> next up: web/check_results.py |
---|
69 | <warner> that calls the get_stats() that's really ICheckResults.get_stats |
---|
70 | [16:40] |
---|
71 | * arc` yanking contents of channel for more leisurely review [16:41] |
---|
72 | <warner> root.py: lines 199 and 212. Both look at a few specific keys and |
---|
73 | ignore the rest, and don't look at latencies |
---|
74 | <warner> next up: web/status.py |
---|
75 | <warner> four hits [16:42] |
---|
76 | <warner> there's a self.helper.get_stats() in |
---|
77 | HelperStatus.data_helper_stats.. the helper doesn't have a storage |
---|
78 | server, so latencies shouldn't be there |
---|
79 | <warner> same for the second hit in HelperStatus.render_JSON |
---|
80 | <warner> third hit is Statistics.renderHTTP (the t=="json" clause): that looks |
---|
81 | like an external HTTP interface for fetching the data. JSON can |
---|
82 | handle None, but let's add "find external HTTP callers" to the list |
---|
83 | of things to investigate [16:43] |
---|
84 | <warner> fourth hit is Statistics.data_get_stats, which is used as input to |
---|
85 | the series of 'render_*' functions in the rest of that |
---|
86 | class. Fortunately, none of them look at latency, except render_raw() |
---|
87 | which just pprints() the whole thing, for which None is fine |
---|
88 | [16:44] |
---|
89 | <warner> ok, that's all of web/status.py |
---|
90 | ERC> |
---|
91 | [16:40] |
---|
92 | * arc` yanking contents of channel for more leisurely review [16:41] |
---|
93 | <warner> root.py: lines 199 and 212. Both look at a few specific keys and |
---|
94 | ignore the rest, and don't look at latencies |
---|
95 | <warner> next up: web/status.py |
---|
96 | <warner> four hits [16:42] |
---|
97 | <warner> there's a self.helper.get_stats() in |
---|
98 | HelperStatus.data_helper_stats.. the helper doesn't have a storage |
---|
99 | server, so latencies shouldn't be there |
---|
100 | <warner> same for the second hit in HelperStatus.render_JSON |
---|
101 | <warner> third hit is Statistics.renderHTTP (the t=="json" clause): that looks |
---|
102 | like an external HTTP interface for fetching the data. JSON can |
---|
103 | handle None, but let's add "find external HTTP callers" to the list |
---|
104 | of things to investigate [16:43] |
---|
105 | <warner> fourth hit is Statistics.data_get_stats, which is used as input to |
---|
106 | the series of 'render_*' functions in the rest of that |
---|
107 | class. Fortunately, none of them look at latency, except render_raw() |
---|
108 | which just pprints() the whole thing, for which None is fine |
---|
109 | [16:44] |
---|
110 | <warner> ok, that's all of web/status.py |
---|
111 | <warner> last one is web/storage.py [16:45] |
---|
112 | <warner> three hits |
---|
113 | <warner> first is StorageStatus.render_JSON: the output of |
---|
114 | self.storage.get_stats() is included in the JSON response |
---|
115 | <warner> so that's another "find external HTTP callers" item |
---|
116 | <warner> second hit is a command [16:46] |
---|
117 | <warner> last hit is in StorageStatus.data_stats, which (looking at the |
---|
118 | associated storage_status.xhtml template for n:data="stats" |
---|
119 | attributes) is used by a couple of local render_* functions |
---|
120 | [16:47] |
---|
121 | <warner> but, none of those touch latency |
---|
122 | *** kevan_ (~kevan@host-134-71-248-56.allocated.csupomona.edu) has quit: Quit: |
---|
123 | leaving |
---|
124 | ERC> OK, that was a bit faster than I could keep up with, . |
---|
125 | * arc` yanking contents of channel for more leisurely review [16:41] |
---|
126 | <warner> root.py: lines 199 and 212. Both look at a few specific keys and |
---|
127 | ignore the rest, and don't look at latencies |
---|
128 | <warner> next up: web/status.py |
---|
129 | <warner> four hits [16:42] |
---|
130 | <warner> there's a self.helper.get_stats() in |
---|
131 | HelperStatus.data_helper_stats.. the helper doesn't have a storage |
---|
132 | server, so latencies shouldn't be there |
---|
133 | <warner> same for the second hit in HelperStatus.render_JSON |
---|
134 | <warner> third hit is Statistics.renderHTTP (the t=="json" clause): that looks |
---|
135 | like an external HTTP interface for fetching the data. JSON can |
---|
136 | handle None, but let's add "find external HTTP callers" to the list |
---|
137 | of things to investigate [16:43] |
---|
138 | <warner> fourth hit is Statistics.data_get_stats, which is used as input to |
---|
139 | the series of 'render_*' functions in the rest of that |
---|
140 | class. Fortunately, none of them look at latency, except render_raw() |
---|
141 | which just pprints() the whole thing, for which None is fine |
---|
142 | [16:44] |
---|
143 | <warner> ok, that's all of web/status.py |
---|
144 | <warner> last one is web/storage.py [16:45] |
---|
145 | <warner> three hits |
---|
146 | <warner> first is StorageStatus.render_JSON: the output of |
---|
147 | self.storage.get_stats() is included in the JSON response |
---|
148 | <warner> so that's another "find external HTTP callers" item |
---|
149 | <warner> second hit is a command [16:46] |
---|
150 | <warner> last hit is in StorageStatus.data_stats, which (looking at the |
---|
151 | associated storage_status.xhtml template for n:data="stats" |
---|
152 | attributes) is used by a couple of local render_* functions |
---|
153 | [16:47] |
---|
154 | <warner> but, none of those touch latency |
---|
155 | *** kevan_ (~kevan@host-134-71-248-56.allocated.csupomona.edu) has quit: Quit: |
---|
156 | leaving |
---|
157 | <warner> so, we're down to HTTP callers of /storage?t=json |
---|
158 | (StorageStatus.render_JSON) and of /statistics?t=json [16:49] |
---|
159 | <warner> where we might be concerned that those callers will mishandle a None |
---|
160 | (really a null, since it's JSON) [16:50] |
---|
161 | <warner> at that point, I think we can stop, because I'm pretty sure there's |
---|
162 | no code in tahoe that calls HTTP and parses JSON for stats.. all |
---|
163 | tahoe's stats-consuming code is either internal, or uses foolscap |
---|
164 | [16:51] |
---|
165 | <zooko> I'm on Skype with Josh. |
---|
166 | <zooko> He had a emacs UI error and didn't realize you were still typing. :-) |
---|
167 | <warner> so, in summary: if the tests pass, then yes, changing the storage |
---|
168 | server latency-reporting code to return None for percentile buckets |
---|
169 | that are not yet full should be ok [16:52] |
---|
170 | <arc`> :-) I was in the buffer I was copying into... |
---|
171 | * arc` blushind |
---|
172 | * arc` blushing |
---|
173 | <warner> want me to pastebin anything? |
---|
174 | <arc`> I think I have it all, going to review now. Thanks! |
---|
175 | ERC> |
---|
176 | |
---|