IOErr counter stuck

Context: we are using `qfsadmin -s $ip -p $port ping` to collect metrics from our QFS cluster. 
One metric we use is the `ncorrupt` counter. When it's not 0, we get an alert to check the disk of the particular chunkserver.

E.g. `s=REDACTED, p=REDACTED, rack=REDACTED, used=28464644933285, free=23535571826431, total=53541442322432, util=56.04, nblocks=437542, lastheard=0, ncorrupt=65, nchunksToMove=0, numDrives=6, numWritableDrives=6, overloaded=0, numReplications=0, numReadReplications=0, good=1, nevacuate=0, bytesevacuate=0, nlost=0, nwrites=40, load=0, md5sum=a95d6ff5740cb73bd29d8330233c40ff, replay=0, connected=1, stopped=0, chunks=437552, tiers=10:1:19:1482:2.37e+12:3.94e+12:39.76;15:5:23:436070:2.12e+13:4.96e+13:57.34, lostChunkDirs=`
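For reference, here is a minimal sketch of how a collector could pull `ncorrupt` out of one such line. It assumes every field is a `key=value` pair separated by `, `, as in the sample above. The function name `parse_server_line` and the shortened sample line with placeholder host and port are made up for illustration and are not part of QFS.

```python
def parse_server_line(line):
    """Split one per-chunkserver line of the ping output into a dict of strings."""
    fields = {}
    for item in line.strip().split(", "):
        # Split on the first "=" only; empty values such as "lostChunkDirs=" are kept.
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


# Shortened, hypothetical sample in the same shape as the line above.
sample = "s=10.0.0.1, p=22000, util=56.04, ncorrupt=65, good=1, lostChunkDirs="
server = parse_server_line(sample)
ncorrupt = int(server["ncorrupt"])
```

All values stay strings until converted, because fields such as `tiers` and `md5sum` are not plain numbers.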

Our problem is that the `ncorrupt` counter does not reset to 0 once the disk issue is fixed. It keeps its value until we restart the corresponding chunkserver.
Is this a feature or a bug?

If this is intended, we'll need to resolve it on our end, but I thought it was worth asking.
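In case it helps the discussion, this is roughly the workaround we would try on our side: alert on increases of `ncorrupt` between two samples instead of on a non-zero value. It is only a sketch and assumes the counter is cumulative since the chunkserver started, which matches what we observe but is not confirmed. The function name `new_corruptions` is ours, not something QFS provides.

```python
def new_corruptions(previous, current):
    """Return how many corrupt chunks were reported since the previous sample.

    Assumption: ncorrupt only grows while the chunkserver runs and starts
    again from 0 after a restart.
    """
    if previous is None:
        # First sample for this chunkserver: report everything seen so far.
        return current
    if current < previous:
        # The counter went down, so the chunkserver was presumably restarted.
        return current
    return current - previous


# The counter stays at 65 after the disk is fixed, so no new alert fires.
unchanged = new_corruptions(65, 65)
# Two more corrupt chunks were detected since the last sample.
increased = new_corruptions(65, 67)
```

With this approach a stuck value of 65 stops alerting after the first report, while any further increase still triggers an alert.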

IOErr counter stuck #239
