The water-suppression scheme is not specified.
The most important experimental parameter determining the quality of water suppression is the homogeneity of the magnetic field, so the shims should be adjusted carefully first. However, even if your 1D spectrum looks good, your 2D TOCSY may still show extensive residual water, for two reasons: incomplete saturation (because of poor line shape or inadequate saturation power), or recovery of the water signal through relaxation.
Relaxation may occur during the evolution period, the mixing period, or the detection period. Water-suppression schemes such as WATERGATE are particularly effective because the solvent suppression takes place just before detection. However, high-Q probes at high magnetic field can cause rapid relaxation of the water magnetization through radiation damping. This can be significant even on the time scale of the selective pulse in WATERGATE, i.e., relaxation occurs while the selective pulse is still being applied (selective pulses can be designed to take radiation damping into account).
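To see why radiation damping can compete with a selective pulse of a few milliseconds, here is a rough order-of-magnitude sketch. It uses the commonly quoted SI expression tau_rd = 2 / (mu0 * eta * Q * gamma * M0), with M0 taken as the equilibrium proton magnetization of pure water. The filling factor (0.1) and probe Q (300) are hypothetical illustration values, not numbers from your setup, so treat the result as an estimate only.

```python
import math

# Physical constants (SI units)
MU0 = 4e-7 * math.pi           # vacuum permeability, T*m/A
GAMMA_H = 2.6752e8             # proton gyromagnetic ratio, rad/(s*T)
HBAR = 1.054572e-34            # reduced Planck constant, J*s
K_B = 1.380649e-23             # Boltzmann constant, J/K
PROTONS_PER_M3 = 6.67e28       # 1H number density of pure water (about 111 mol/L)


def equilibrium_magnetization(b0_tesla, temp_kelvin):
    """Curie-law equilibrium magnetization of water protons (spin 1/2), in A/m."""
    return PROTONS_PER_M3 * (GAMMA_H * HBAR) ** 2 * b0_tesla / (4 * K_B * temp_kelvin)


def radiation_damping_time(b0_tesla, q_factor, filling_factor, temp_kelvin=298.0):
    """Estimate the radiation damping time constant in seconds."""
    m0 = equilibrium_magnetization(b0_tesla, temp_kelvin)
    return 2.0 / (MU0 * filling_factor * q_factor * GAMMA_H * m0)


if __name__ == "__main__":
    # Hypothetical probe: Q = 300, filling factor = 0.1, at 14.1 T (about 600 MHz)
    tau = radiation_damping_time(b0_tesla=14.1, q_factor=300, filling_factor=0.1)
    print(f"Estimated radiation damping time: {tau * 1e3:.1f} ms")
```

With these assumed values the estimate comes out at a few milliseconds, which is comparable to the length of a typical selective pulse. A higher Q or a higher field shortens it further.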
The horizontal artifact in your spectrum is not symmetric, which suggests that the irradiation (carrier) frequency may be set incorrectly. When TOCSY is run with a standard parameter set, the sample temperature rises slightly AFTER the experiment has started, because RF energy is converted into heat. This produces a temperature gradient across the sample. The chemical shift of water is very sensitive to temperature, so the absorption maximum is not the same for all water molecules in the sample. If this is the cause, the artifact may be reduced as follows. Start the 2D experiment. Once a steady state of thermal gradients has been established (about 5-10 minutes), stop the experiment. Quickly readjust the shims and the carrier frequency for optimum water suppression, preferably checking the suppression with the TOCSY sequence itself, with the incrementable delay set to a non-zero value. Then restart the 2D with the standard parameters, preferably using a large number of dummy scans for the first increment.
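To get a feel for how much a small temperature change moves the water resonance, the sketch below converts a temperature difference into a frequency offset. It assumes a temperature coefficient of roughly -0.01 ppm per kelvin for the water 1H shift, which is an approximate literature value. The function name and the example temperature differences are illustrative only.

```python
# Approximate temperature coefficient of the water 1H chemical shift.
# Assumed value of about -0.01 ppm/K; the exact figure depends on the sample.
WATER_SHIFT_COEFF_PPM_PER_K = -0.01


def water_offset_hz(delta_t_kelvin, spectrometer_mhz):
    """Shift of the water resonance in Hz for a given temperature change.

    A shift in ppm multiplied by the 1H frequency in MHz gives the shift in Hz.
    """
    return WATER_SHIFT_COEFF_PPM_PER_K * delta_t_kelvin * spectrometer_mhz


if __name__ == "__main__":
    # Illustrative temperature differences across the sample, at 600 MHz
    for delta_t in (0.2, 0.5, 1.0):
        offset = water_offset_hz(delta_t, 600.0)
        print(f"dT = {delta_t:.1f} K -> water moves by {offset:+.1f} Hz")
```

Even a fraction of a kelvin therefore moves the water line by several hertz at high field, which is enough to take part of the sample off the carrier frequency that was optimized on a cold sample.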