Understanding how the human brain is able to efficiently perceive and understand a visual scene is still a field of ongoing research. Although many studies have focused on the design and optimization of neural networks to solve visual recognition tasks, most of them either lack neurobiologically plausible learning rules or decision-making processes. Here we present a large-scale model of a hierarchical spiking neural network (SNN) that integrates a low-level memory encoding mechanism with a higher-level decision process to perform a visual classification task in real-time. The model consists of Izhikevich neurons and conductance-based synapses for realistic approximation of neuronal dynamics, a spike-timing-dependent plasticity (STDP) synaptic learning rule with additional synaptic dynamics for memory encoding, and an accumulator model for memory retrieval and categorization. The full network, which comprised 71,026 neurons and approximately 133 million synapses, ran in real-time on a single off-the-shelf graphics processing unit (GPU). The network was constructed on a publicly available SNN simulator that supports general-purpose neuromorphic computer chips. The network achieved 92% correct classifications on MNIST in 100 rounds of random sub-sampling, which is comparable to other SNN approaches and provides a conservative and reliable performance metric. Additionally, the model correctly predicted reaction times from psychophysical experiments. Because of the scalability of the approach and its neurobiological fidelity, the current model can be extended to an efficient neuromorphic implementation that supports more generalized object recognition and decision-making architectures found in the brain.