The aim of this study is to speed up the scaled conjugate gradient (SCG) algorithm by reducing the training time per iteration. SCG is a supervised learning algorithm for network-based methods and is generally used to solve large-scale problems. SCG approximates second-order information from two first-order gradients of the parameters, each computed over the entire training dataset. As a result, the computational cost per iteration becomes high for large-scale problems. In this study, one of the two first-order gradients is estimated from previously calculated gradients without using the training dataset. A least-squares error estimator is applied to obtain this estimate. For large-scale problems, the complexity of estimating the gradient is much smaller than that of computing it, because the estimation is independent of the size of the dataset. The proposed algorithm is applied to neuro-fuzzy classifier training and neural network training. The theoretical basis for the algorithm is provided, and its performance is illustrated on several well-known datasets, where it is compared with several other training algorithms. The empirical results indicate that the proposed algorithm requires less time per iteration than SCG: it decreases the training time by 20-50% compared to SCG, while its convergence rate is similar to that of SCG.
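
The idea of replacing one gradient computation with an estimate built from stored gradients can be sketched as follows. This is a minimal illustration under stated assumptions, not the estimator derived in the study: it assumes the error surface is locally close to quadratic, so that gradient differences are approximately linear in weight differences. The names `estimate_gradient`, `scg_second_order_term`, `w_hist`, and `g_hist` are hypothetical and chosen only for this example. The second function uses the standard SCG finite-difference form s_k = (E'(w_k + σ_k p_k) − E'(w_k)) / σ_k, with the first gradient replaced by the estimate.

```python
import numpy as np


def estimate_gradient(w_trial, w_hist, g_hist):
    """Estimate the gradient at w_trial from previously computed gradients.

    Assumption (not from the source): near the current point the gradient is
    approximately affine in the weights, as it is for a quadratic error.

    w_hist : (m, n) array of earlier weight vectors, most recent last
    g_hist : (m, n) array of the gradients computed at those weights
    """
    w_hist = np.asarray(w_hist, dtype=float)
    g_hist = np.asarray(g_hist, dtype=float)

    # Use the most recent point as the reference.
    w_ref, g_ref = w_hist[-1], g_hist[-1]

    # Columns are differences of older points relative to the reference.
    dW = (w_hist[:-1] - w_ref).T  # shape (n, m - 1)
    dG = (g_hist[:-1] - g_ref).T  # shape (n, m - 1)

    # Least-squares fit: express the trial displacement as a combination
    # of the stored weight displacements.
    coeffs, *_ = np.linalg.lstsq(dW, w_trial - w_ref, rcond=None)

    # Apply the same combination to the stored gradient differences.
    return g_ref + dG @ coeffs


def scg_second_order_term(w_k, p_k, sigma_k, w_hist, g_hist):
    """SCG finite-difference term with an estimated trial gradient.

    Assumes w_hist[-1] == w_k, so g_hist[-1] is the true gradient at w_k.
    No pass over the training data is made here.
    """
    g_trial = estimate_gradient(w_k + sigma_k * p_k, w_hist, g_hist)
    return (g_trial - g_hist[-1]) / sigma_k
```

The cost of this sketch depends only on the number of parameters and the number of stored gradients, not on the number of training samples, which is the property the study relies on. For an exactly quadratic error with enough independent stored points the estimate coincides with the true gradient; for a general network error it is only an approximation, and its accuracy depends on how local the stored points are.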