Every number on this dashboard traces back to one of three things: a SQL aggregation over the source xlsx data, a leakage-component formula with explicit constants, or a trained AI model called via the FastAPI service. This page lays out all three so reviewers can audit and challenge the methodology.
Seven xlsx files from the AP State Excise dataset
| File | Contents |
| --- | --- |
| retailers | 4,899 active retail outlets: district, geo, status |
| brands | 1,457 registered brands with strength, MRP |
| labels | 1,373 approved labels for FY 2025-26 (Lock 3 reference) |
| sales_yearwise | FY rollups · primary revenue source |
| sales_inflow | Daily inflow ledger to retailers |
| distillery_quota | 42 units · allotted vs lifted PL (Lock 1 quota gap) |
| seizures | Enforcement cases since 2019 (counterfeit signal) |

Editable in code · exposed for sensitivity analysis
| Constant | Value | Role |
| --- | --- | --- |
| Bottles per PL | 6.66 | Unit conversion (proof-litre → 750ml bottles) |
| Avg realized price/bottle | ₹145.40 | Distillery quota → revenue |
| Border zone radius | 50 km | Retailers within this distance flagged |
| Border capture % | 0.80% | Of trailing revenue (interstate differential) |
| Anomaly capture % | 1.50% | Sales outlier-bound (replaceable by Lock 1) |
| Avg case value | ₹1,500 | Counterfeit signal multiplier per seizure |
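To make the border-zone constant concrete, here is a minimal sketch of how a retailer could be flagged when it lies within 50 km of a border. The page does not say how distance to the border is measured, so the haversine formula, the `in_border_zone` helper, and the list of border reference points are all assumptions for illustration.

```python
import math

BORDER_ZONE_RADIUS_KM = 50.0  # "Border zone radius" from the constants table
EARTH_RADIUS_KM = 6371.0      # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_border_zone(lat, lon, border_points, radius_km=BORDER_ZONE_RADIUS_KM):
    """True if the retailer is within radius_km of any border reference point.

    border_points is a hypothetical list of (lat, lon) samples along the
    state border; the real dashboard may use a different geometry.
    """
    return any(
        haversine_km(lat, lon, b_lat, b_lon) <= radius_km
        for b_lat, b_lon in border_points
    )
```

Because the radius is a keyword argument, a sensitivity run only needs to call `in_border_zone` with a different `radius_km`.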
Four components sum to ₹876.7 Cr / yr (2.97% of baseline)
Replaced in Lock 1 by a per-retailer IsolationForest score — quota gap stays as the ground-truth ceiling.
1.5% of retail revenue treated as outlier-bound. Refined by Lock 1's compliance classifier (currently 96.6% test accuracy).
Refined by Lock 2 corridor deviations (current OCSVM recall: 85.7% on injected GPS deviations).
Lower bound — represents only what enforcement caught. Lock 3 (CNN) shifts this from reactive to predictive once deployed.
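The four components can be sketched as a single function over the constants in the table above. This is one plausible reading of how the constants combine, not the dashboard's confirmed formula: the function name, the argument names, and the direction of the quota gap (allotted minus lifted) are assumptions, and the inputs in the usage note are placeholders rather than dataset values.

```python
# Constants copied from the table above.
BOTTLES_PER_PL = 6.66          # bottles per proof-litre
AVG_PRICE_PER_BOTTLE = 145.40  # rupees
BORDER_CAPTURE = 0.0080        # 0.80% of trailing border-zone revenue
ANOMALY_CAPTURE = 0.0150       # 1.50% of retail revenue
AVG_CASE_VALUE = 1500.0        # rupees per seizure case
RUPEES_PER_CRORE = 1e7


def leakage_components(allotted_pl, lifted_pl, retail_revenue,
                       border_revenue, seizure_cases):
    """Return the four leakage components and their total, in rupees.

    Assumption: the quota gap is (allotted - lifted) proof-litres,
    converted to bottles and then to revenue.
    """
    components = {
        "quota_gap": (allotted_pl - lifted_pl) * BOTTLES_PER_PL * AVG_PRICE_PER_BOTTLE,
        "sales_anomaly": retail_revenue * ANOMALY_CAPTURE,
        "border": border_revenue * BORDER_CAPTURE,
        "counterfeit": seizure_cases * AVG_CASE_VALUE,
    }
    components["total"] = sum(components.values())
    return components


def to_crore(rupees):
    """Convert rupees to crore for display."""
    return rupees / RUPEES_PER_CRORE
```

A call such as `leakage_components(1000, 900, 1_000_000, 500_000, 10)` uses made-up inputs purely to exercise the arithmetic; reproducing the headline total requires the real aggregates from the source files.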
What we trained, on what features, and how we measured it
IsolationForest + GradientBoostingClassifier
Trained on per-retailer monthly velocity. The IF anomaly score is one of 8 features feeding the GB classifier.
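The stacking idea described here, an IsolationForest score appended as the eighth feature for a gradient-boosted classifier, can be sketched with scikit-learn. The seven base features, the labels, and the sample size below are synthetic placeholders; the real feature list and hyperparameters are not given on this page.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, IsolationForest
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: 7 hypothetical per-retailer velocity features.
n_retailers = 400
base_features = rng.normal(size=(n_retailers, 7))
# Synthetic compliance labels, purely for illustration.
labels = (base_features[:, 0] + 0.5 * base_features[:, 1] > 1.0).astype(int)

base_train, base_test, y_train, y_test = train_test_split(
    base_features, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit the IsolationForest on the training split only, to avoid leakage.
iso = IsolationForest(n_estimators=100, random_state=0).fit(base_train)


def add_if_score(base):
    """Append the anomaly score as an extra column (higher = more anomalous)."""
    # score_samples returns lower values for more abnormal points, so negate.
    anomaly_score = -iso.score_samples(base)
    return np.column_stack([base, anomaly_score])


X_train = add_if_score(base_train)  # 7 base features + 1 IF score = 8
X_test = add_if_score(base_test)

clf = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)
```

Fitting the IsolationForest before the split would let test rows influence the anomaly score, which is why the sketch fits it on `base_train` alone.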
OneClassSVM (RBF kernel, ν=0.05)
GPS data is synthesized in the same shape we would ingest in production (lat/lon polylines + features). Replace the synthetic data with real telematics before deployment.
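A minimal sketch of the one-class setup follows, assuming scikit-learn: the model is fit on normal trips only, then recall is measured on injected deviations. Only the RBF kernel and ν=0.05 come from this page; the three per-trip features, the scaling step, `gamma="scale"`, and the synthetic distributions are assumptions.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Placeholder per-trip features (for example, hypothetical corridor offset,
# detour ratio, and stop count), not the dashboard's real feature set.
normal_trips = rng.normal(loc=0.0, scale=1.0, size=(500, 3))
deviated_trips = rng.normal(loc=6.0, scale=1.0, size=(50, 3))  # injected deviations

# Scale using statistics from normal trips only, then fit on normal trips.
scaler = StandardScaler().fit(normal_trips)
model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
model.fit(scaler.transform(normal_trips))

# predict returns -1 for outliers and +1 for inliers.
pred = model.predict(scaler.transform(deviated_trips))
recall = float(np.mean(pred == -1))  # share of injected deviations caught
```

Here ν bounds the fraction of training trips treated as outliers, so raising it trades more false alarms on normal routes for higher recall on deviations.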
mobilenet_v3_small
Architecture and pipeline are production-ready. Retraining on photographed real holograms is a 1-day exercise once those images are available.
What is real data, what is synthesized, and what should change for production