There has been increased interest in optimization methods for the analysis of large-scale data sets, which require gigabytes or terabytes of storage. A variety of applications arise in signal processing, machine learning and statistics. Large-scale problems impose restrictions on the methods employed so far: new methods must be memory-efficient and, ideally, should make noticeable progress towards a solution within seconds. First-order methods meet some of these requirements. They avoid matrix factorizations, have low memory requirements and sometimes make fast progress in the initial stages of optimization. Unfortunately, as demonstrated by numerical experiments in this thesis, first-order methods miss essential information about the conditioning of the problem, which can result in slow practical convergence. The main advantage of first-order methods, their reliance on simple gradient or coordinate updates alone, thus becomes their essential weakness. We do not believe this inherent weakness of first-order methods can be remedied. For this reason, the present thesis aims at the development and implementation of inexpensive higher-order methods for large-scale problems.
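The sensitivity of first-order methods to conditioning can be made concrete with a standard textbook illustration (a worked example in our own notation, not one of the numerical experiments of this thesis). Consider minimizing a strongly convex quadratic with Hessian $A$ and condition number $\kappa$:
\[
f(x) = \tfrac{1}{2}\, x^{\top} A x - b^{\top} x,
\qquad A \succ 0,
\qquad \kappa = \frac{\lambda_{\max}(A)}{\lambda_{\min}(A)}.
\]
Gradient descent with the best fixed step size $2/(\lambda_{\min}(A) + \lambda_{\max}(A))$ satisfies
\[
\| x_{k} - x^{*} \| \le \left( \frac{\kappa - 1}{\kappa + 1} \right)^{k} \| x_{0} - x^{*} \|,
\]
so reaching accuracy $\varepsilon$ takes on the order of $\kappa \log(1/\varepsilon)$ iterations; for $\kappa = 10^{6}$, this means millions of gradient steps. By contrast, a single Newton step $x_{1} = x_{0} - A^{-1}(A x_{0} - b) = x^{*}$ solves the problem exactly, precisely because the Hessian $A$ carries the conditioning information that plain gradient updates discard.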