Sample demographics and farm typologies
The sample consisted of 400 bovine dairy farms located in four key Indian states, stratified by scale and farm type. Table 1 summarizes demographic data.
Men made up the majority of respondents (>93%), and mean years of education ranged from 8.7 in Uttar Pradesh to 11.5 in Punjab. The average household had five or six members. Dairying was the primary source of income in all states, although mixed farming was more common in Maharashtra (Potdar et al., 2020).
Cluster analysis of farm types
Cluster analysis revealed three distinct groupings across the whole sample (Table 2).
Punjab and Maharashtra were dominated by large, high-tech farms (C1), whereas smallholder and family module farms were more prevalent in Uttar Pradesh and Andhra Pradesh. Mean herd size differed significantly across clusters, averaging 16.2 animals in C1, 8.9 in C2 and 4.2 in C3. Fig 3 shows a silhouette plot of the cluster analysis that visually distinguishes and compares the three farm types.
The strong relationship between cluster membership and efficiency scores underlines the importance of farm characteristics and management systems. Cluster C1 (large, high-tech farms) achieved an average technical efficiency of 0.96 and economic efficiency of 0.89; Cluster C2 (medium, mixed-management farms) achieved an average technical efficiency of 0.84 and economic efficiency of 0.72; and Cluster C3 (small, traditional farms) achieved an average technical efficiency of 0.63 and economic efficiency of 0.48. These differences were highly significant (one-way ANOVA for technical efficiency: F = 256.3, p<0.001; for economic efficiency: F = 198.7, p<0.001). Notably, all three clusters showed lower economic efficiency than technical efficiency, indicating that even farms operating near the technical frontier face allocative constraints, possibly reflecting a lower-quality input mix and loss of market value rather than simply technical underperformance.
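The gap between technical and economic efficiency can be made explicit with a short calculation. The sketch below assumes the standard decomposition EE = TE × AE, which the text does not state for the cluster means, and applies it to the reported averages. Because it divides mean scores rather than averaging farm-level ratios, the resulting allocative efficiency figures are only indicative. The names `cluster_scores` and `implied_allocative_efficiency` are illustrative.

```python
# Cluster mean scores as reported in the text: (technical, economic).
cluster_scores = {
    "C1": (0.96, 0.89),  # large, high-tech farms
    "C2": (0.84, 0.72),  # medium, mixed-management farms
    "C3": (0.63, 0.48),  # small, traditional farms
}


def implied_allocative_efficiency(te: float, ee: float) -> float:
    """Return AE implied by the assumed decomposition EE = TE * AE."""
    return ee / te


# Ratio of cluster means; an approximation, not a farm-level average.
implied_ae = {
    cluster: round(implied_allocative_efficiency(te, ee), 3)
    for cluster, (te, ee) in cluster_scores.items()
}

for cluster, ae in implied_ae.items():
    print(f"{cluster}: implied AE = {ae}")
```

Under this assumption the implied allocative efficiency falls from about 0.93 in C1 to about 0.76 in C3, consistent with the observation that allocative constraints affect every cluster but weigh most on small farms.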
Input use and cost structure
Annual input usage (feed, labor, biologics) and associated costs were tabulated for each farm type (Table 3).
The highest input cost, particularly for semi-commercial farms, was feed (61% of total cost), followed by labour and biologics. Because of controlled feeding, family module farms also spent more on labour per buffalo (Gadhvi et al., 2024).
Beyond average input use, analysis of input use efficiency reveals systematic differences. Variable milk production costs per liter were ₹ 13.8 for semi-commercial farms, ₹ 12.1 for family farms and ₹ 13.8 for smallholder farmers. Unexpectedly, semi-commercial farms therefore enjoy no scale-based advantage in variable cost per unit of output. Allocating fixed costs changed this picture substantially: when capital depreciation and land rental value were included, the economic cost per liter increased to ₹ 18.6 (semi-commercial), ₹ 18.9 (family farms) and ₹ 24.1 (smallholders; p<0.001). This suggests that economic inefficiency in smallholders arises not from wastage of technical inputs, but from the burden of high fixed costs relative to production volume.
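The fixed-cost burden behind these figures can be recovered by simple subtraction. The sketch below uses only the per-liter costs reported above. It assumes that the difference between economic and variable cost is entirely the allocated fixed cost (capital depreciation plus land rental value). The variable names are illustrative.

```python
# Per-liter costs in rupees as reported in the text: (variable, economic).
cost_per_liter = {
    "semi-commercial": (13.8, 18.6),
    "family": (12.1, 18.9),
    "smallholder": (13.8, 24.1),
}

# Allocated fixed cost per liter = economic cost - variable cost (assumed).
fixed_cost = {
    farm: round(economic - variable, 1)
    for farm, (variable, economic) in cost_per_liter.items()
}

# Fixed cost as a percentage of total economic cost per liter.
fixed_share_pct = {
    farm: round(100 * (economic - variable) / economic, 1)
    for farm, (variable, economic) in cost_per_liter.items()
}

for farm in cost_per_liter:
    print(f"{farm}: fixed cost = Rs {fixed_cost[farm]}/liter "
          f"({fixed_share_pct[farm]}% of economic cost)")
```

On these numbers the allocated fixed cost per liter is roughly ₹ 4.8 for semi-commercial farms, ₹ 6.8 for family farms and ₹ 10.3 for smallholders, so smallholders carry more than twice the fixed-cost load of semi-commercial farms despite identical variable costs.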
Output performance
Annual yield, sale price and gross income are shown in Table 4. Milk alone brought in over ₹ 2.8 lakh a year for semi-commercial farms. Owing largely to lower yields and local market constraints, smallholders earned on average ₹ 41,800 less than expected for the same level of input.
Technical, allocative and economic efficiency scores (DEA/SFA)
Technical efficiency (TE), allocative efficiency (AE) and economic efficiency (EE) were computed using PIM-DEA; SFA provided mean inefficiency indices (Table 5). With an economic efficiency score of 0.52, as opposed to 0.87 for semi-commercial farms, smallholders clearly trailed behind. SFA supported the DEA findings by showing the least inefficiency in semi-commercial farms. Differences in efficiency scores among farm types were statistically significant according to the Kruskal-Wallis test, and post-hoc pairwise comparisons indicated that smallholders differed significantly from the other two categories (TE: p<0.001; EE: p<0.001). Average efficiency scores mask considerable variation within categories. Smallholder farmers’ technical efficiency scores ranged from 0.42 to 0.92 (95% CI: 0.69-0.73), indicating that some farmers are close to best-practice performance while others have considerable room for improvement. In contrast, the range of efficiency in semi-commercial farms was narrower (0.87-0.97; 95% CI: 0.93-0.95), indicating more consistent management and tighter clustering around best practice.
Fig 4 displays a DEA efficiency frontier plot that illustrates the relative positions of family module, smallholder and semi-commercial buffalo dairy farms in relation to the efficiency frontier.
Explanation of DEA formula application
DEA constructs the efficiency frontier through linear programming, solving for the smallest proportion (θ) to which a farm's inputs can be contracted while still producing at least its observed level of outputs. Farms on the frontier receive θ = 1 (fully efficient), while those below it receive θ < 1. Input-oriented VRS (variable returns to scale) models allow for variation in farm scale and imperfectly competitive markets.
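For reference, the linear program described here can be written in the standard input-oriented envelopment form under variable returns to scale (the BCC model). The exact specification used in the study is not given, so this is a textbook formulation rather than the authors' own. The symbols are assumptions introduced for illustration: x_{ij} is the quantity of input i used by farm j, y_{rj} is the quantity of output r produced by farm j, λ_j are the intensity weights on peer farms, and the subscript 0 marks the farm being evaluated.

```latex
\[
\begin{aligned}
\min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad
  & \sum_{j=1}^{n} \lambda_j x_{ij} \le \theta\, x_{i0}, && i = 1,\dots,m \\
  & \sum_{j=1}^{n} \lambda_j y_{rj} \ge y_{r0},          && r = 1,\dots,s \\
  & \sum_{j=1}^{n} \lambda_j = 1, \\
  & \lambda_j \ge 0,                                     && j = 1,\dots,n
\end{aligned}
\]
```

The convexity constraint on the weights (their sum equals one) is what imposes variable returns to scale; dropping it gives the constant-returns model. The optimal θ is the technical efficiency score, so θ = 1 places the farm on the frontier.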
Principal component analysis (PCA)
Principal factors driving efficiency included feed and input quality, labor management, technology use and market access (Table 6).
High labour skill and feed/forage loadings indicated their significant influence on farm productivity. The significance of important variables (feed quality, labour, technology and market access) as efficiency drivers in buffalo dairy farms is illustrated by a PCA bar plot of factor loadings (Fig 5).
Component 1 (input quality and forage management) showed the strongest loadings for forage type (0.82), forage availability (0.79) and forage cost (0.75), indicating that forage-related decisions are the primary drivers of efficiency variation across the sample. Component 2 (labor management) had strong loadings on labor experience (0.88), skilled labor adoption (0.81) and labor utilization intensity (0.74), reflecting differences in human capital deployment. Component 3 (technology use) had loadings on adoption of artificial insemination (0.85), record-keeping (0.82) and modernization of milking process (0.71), which reflects the dimension of technology adoption. These three components together explained 59% of the total variation in the combined input–output dataset, with the remaining 41% being due to location-specific factors, market access variation and unmeasured management factors.
Regional and typological differences
State-wise analysis revealed significant variation in efficiency performance. Technical efficiency scores were 0.92 in Punjab (n=100), 0.87 in Maharashtra (n=100), 0.76 in Andhra Pradesh (n=100) and 0.72 in Uttar Pradesh (n=100), a spread of 20 percentage points (ANOVA: F = 78.4, p<0.001). Economic efficiency varied even more across states, ranging from 0.85 in Punjab to 0.51 in Uttar Pradesh, a spread of 34 percentage points (F = 112.6, p<0.001). These differences persist even after controlling for farm size category, suggesting that state-level factors including extension service quality, input market development and milk marketing infrastructure significantly influence farm-level outcomes.
Comparative analysis revealed significant regional differences: smallholders in Uttar Pradesh generally scored between 0.48 and 0.62, whereas semi-commercial farms in Punjab achieved near-frontier levels (economic efficiency, 0.87). Owing to better extension access, mid-sized family module farms in both Andhra Pradesh and Maharashtra showed higher efficiency (TE, 0.91).
Output price and gross income were significantly affected by market proximity and infrastructure connectivity. Farms nearer urban centres obtained 8-12% higher milk prices, which directly affected economic efficiency.
Cluster explanation and typology benchmarking
The classification of farms into large-, medium- and smallholder categories was confirmed by cluster analysis. Large farms had the highest record-keeping and artificial insemination (AI) adoption rates (70-85%), along with high input costs, output volumes and profit margins. Market access, input quality and technology adoption were all lowest in the C3 (smallholder) cluster. In multivariate regression, cluster membership accounted for about 55% of the variation in efficiency.
This large-sample study provides important insights into the intricate factors influencing technical and economic efficiency in Indian buffalo dairying. First, the analysis indicates that the average smallholder farm operates at only 50-60% of its potential economic efficiency, with about one-third of inputs wasted or allocated inefficiently. The gap is mainly due to allocative inefficiencies and scale-related cost burdens rather than technical mismanagement, as even small farms operating near the technical frontier fail to optimize their input mix under market constraints.
These results point to a recurring problem with resource use and imply that, to achieve significant sector-wide improvements, policymakers and extension agents should concentrate their efforts on smallholders. Researchers studying dairy economics have frequently cited the spread of technology, particularly artificial insemination, farm record keeping and better fodder management, as essential to boosting output (Rhone et al., 2008; Singh et al., 2021). The PCA and cluster results of this study corroborate these assertions and add detail: the main explanatory factors across regions were feed quality and composition, the deployment of skilled labour and proximity to dependable urban markets. The persistently low allocative efficiency across all groups suggests that even technically efficient farms suffer systematic disadvantages in feed cost management and labor assessment due to market segmentation and incomplete information.
The significantly higher returns to scale and input specialization of semi-commercial farms, which were frequently better organized and had more reliable market access, highlight the importance of aggregation in agricultural operations. State-to-state comparisons underline geographic heterogeneity. For example, the large, high-tech farms in Maharashtra and Punjab show how extensive regional extension, infrastructure and input supply chains can produce near-frontier results. In contrast, Uttar Pradesh and Andhra Pradesh show far wider gaps between technical and economic efficiency, due in large part to their isolated markets and slower adoption of modern methods (Dwivedi et al., 2024; Sapkal et al., 2025; Singh et al., 2025).
These state-level variations reflect differences in the intensity of extension services, the development of milk marketing infrastructure and the maturity of input supply chains rather than underlying differences in farmers’ capacity. The joint use of DEA and SFA illustrated both the potential efficiency gains and the limits imposed by contextual obstacles. DEA provided a frontier-based benchmark for input-output optimization, whereas SFA offers guidance for targeted improvement through continual evaluation of inefficiencies. However, the efficiency scores indicate that improving farm performance requires addressing physical and sociocultural constraints, including access to finance and transportation, in addition to adjusting input resources. The results offer a concrete analytical framework for researchers, planners and legislators. By clearly distinguishing between farmer groups and the factors affecting them, targeted recommendations become possible: instead of focusing solely on increasing total yield, extension workers should actively assist in improving feed quality, training skilled workers and strengthening market linkages. Similarly, record keeping and the promotion of AI (artificial insemination) can be framed as a practical two-step intervention with the potential to increase productivity by 15% to 20%. However, the cross-sectional design makes it difficult to establish cause-and-effect relationships, and differences in efficiency among farm types and regions may be confounded by genetic differences in livestock and variations in milk quality, which were not measured.