added broadcasted() flag for tensor, added broadcast to mult, refactored gpu traverse unary
6 files changed