ABSTRACT Drug-induced phospholipidosis (PL) is a condition characterized by the accumulation of phospholipids and drug in lysosomes, and is found in a variety of tissue types. PL is frequently observed in preclinical studies and may delay or prevent the development of pharmaceuticals. This report describes the construction of a database of PL findings in a variety of animal species and its use as a training data set for computational toxicology software. PL data and chemical structures were compiled from the published literature, existing pharmaceutical databases, and Food and Drug Administration (FDA) internal reports, yielding a total of 583 compounds suitable for modeling. The database contained 190 (33%) positive drugs and 393 (67%) negative drugs, of which 39 were electron microscopy-confirmed negative compounds and 354 were classified as negatives due to the absence of reported positive data. Of the 190 positive findings, 76 were electron microscopy confirmed and 114 were considered positive based on other evidence. Quantitative structure-activity relationship (QSAR) models were constructed using two commercially available software programs, MC4PC and MDL-QSAR, and internal cross-validation (10 x 10%) experiments were performed to assess their predictive performance. Performance parameters for the MC4PC model were specificity 92%, sensitivity 50%, concordance 78%, positive predictivity 76%, and negative predictivity 78%. For MDL-QSAR, predictive performance was similar: specificity 80%, sensitivity 76%, concordance 79%, positive predictivity 65%, and negative predictivity 87%. Combining the output of the two QSAR programs improved overall predictive performance: sensitivity could be optimized to 81% without significant loss of specificity (79%).
Many of the structural alerts and significant molecular descriptors obtained from the QSAR software were found to be associated with parts of active molecules known for their cationic amphiphilic drug (CAD) properties, supporting the hypothesis that the endpoint of PL is statistically correlated with chemical structure. QSAR models can be useful tools for screening drug candidate molecules for potential PL.
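The five performance parameters quoted above all follow from a single 2 x 2 confusion matrix. The sketch below shows how they relate, using the standard definitions. The names `Confusion`, `performance`, and `from_rates` are hypothetical helpers introduced here for illustration and are not part of MC4PC or MDL-QSAR. The counts are back-calculated from the reported class sizes (190 positives, 393 negatives) and the reported sensitivity and specificity; they are approximations, not the actual cross-validation tallies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts of a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives: PL-positive drugs predicted positive
    fn: int  # false negatives: PL-positive drugs predicted negative
    tn: int  # true negatives: PL-negative drugs predicted negative
    fp: int  # false positives: PL-negative drugs predicted positive


def performance(c: Confusion) -> dict:
    """Return the five performance parameters as fractions in [0, 1]."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "concordance": (c.tp + c.tn) / total,
        "positive_predictivity": c.tp / (c.tp + c.fp),
        "negative_predictivity": c.tn / (c.tn + c.fn),
    }


def from_rates(n_pos: int, n_neg: int,
               sensitivity: float, specificity: float) -> Confusion:
    """Approximate a confusion matrix from class sizes and reported rates.

    Assumption: counts are rounded to the nearest integer, so the derived
    values can differ slightly from the published percentages.
    """
    tp = round(n_pos * sensitivity)
    tn = round(n_neg * specificity)
    return Confusion(tp=tp, fn=n_pos - tp, tn=tn, fp=n_neg - tn)


if __name__ == "__main__":
    # Class sizes and rates taken from the abstract; counts are approximate.
    reported = {
        "MC4PC": from_rates(190, 393, sensitivity=0.50, specificity=0.92),
        "MDL-QSAR": from_rates(190, 393, sensitivity=0.76, specificity=0.80),
    }
    for name, matrix in reported.items():
        summary = ", ".join(
            f"{key} {value:.0%}" for key, value in performance(matrix).items()
        )
        print(f"{name}: {summary}")
```

Under these assumptions, the back-calculated concordance and predictivity values agree with the reported figures to within roughly one to two percentage points; the residual differences are expected from rounding.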