In Section 4.18 on neural networks, note that it is customary to
scale the data before calling neuralnet() or an analogous function
in other packages. This custom did not arise from theoretical
considerations, but rather because many users found that the iterative
fitting process failed to converge without scaling. (This rather ad
hoc, "folklore" approach is typical of machine learning methods.)
The types of scaling used vary, ranging from the R scale() function,
which by default centers each column and divides it by its standard
deviation, to a simple linear transformation of the data to [0,1] or
[-1,1]. In the regression case, the "Y" variable (response) is scaled
as well.
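
The scaling options just described can be sketched in R as follows.
This is only an illustration under stated assumptions: the data frame
d is made-up toy data, and rescale() and unscale_y() are hypothetical
helpers written for this example, not functions from base R or from
the neuralnet package. Only scale() is a standard R function.

```r
# Toy data (an assumption for illustration): two predictors on very
# different scales, plus a response.
set.seed(1)
d <- data.frame(x1 = runif(20, 0, 1000),
                x2 = rnorm(20),
                y  = rnorm(20, mean = 50, sd = 10))

# Option 1: scale() centers each column to mean 0 and divides by its
# standard deviation. It returns a matrix, so convert back.
d_std <- as.data.frame(scale(d))

# Option 2: linear map of a vector onto [lo, hi].
# rescale() is a hypothetical helper; it assumes v is not constant.
rescale <- function(v, lo = 0, hi = 1) {
  lo + (hi - lo) * (v - min(v)) / (max(v) - min(v))
}
d_01  <- as.data.frame(lapply(d, rescale))                   # to [0,1]
d_pm1 <- as.data.frame(lapply(d, rescale, lo = -1, hi = 1))  # to [-1,1]

# Since Y is scaled too, fitted values are in scaled units. This
# hypothetical helper inverts the [0,1] map for the response.
unscale_y <- function(ys, y_orig) {
  ys * (max(y_orig) - min(y_orig)) + min(y_orig)
}
y_back <- unscale_y(d_01$y, d$y)
```

Any of d_std, d_01 or d_pm1 could then be passed as the data argument
in place of d. Note that new cases to be predicted would need to be
transformed with the minima, maxima, means and standard deviations
computed from the training data, not from the new cases themselves.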